How we built a GPT-3 powered spreadsheet formula assistant

Hjalmar Gislason
GRID
Feb 16, 2023


Earlier this week, we launched GRID 2.0, a major update that marks our full transition from a secondary tool on top of traditional spreadsheets to a full-blown, all-in-one numbers tool, making good on a plan we announced in early 2022.

While it is just one of the many innovations we’re introducing in GRID to make numbers work easier for everyone, we highlighted our built-in formula assistant as we knew it would turn some heads. It allows our users to write a description of the calculation they’re looking to do and receive a suggested formula in return. You can see it in action in the video below or simply sign up for free and try it for yourselves.

The user interface

We decided to make using the formula assistant as natural a part of spreadsheet editing as possible, so the user interface is built directly into the sheet, accessible either in the formula bar or directly inside the cell itself, triggered by typing a double-slash: //

You’ll then be instructed to describe the formula you need in plain English (fun fact: it works pretty well in other languages too!).

By hitting Enter or simply pausing, you’ll trigger a spinning animation transitioning between the double-slash and an equals sign.

A second or two later it returns a suggested formula. Seeing the suggestion, you can then choose to accept it, refine your natural language description or edit the formula before submitting it to the cell.

Or as we like to say now: // is the new =
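The trigger itself is easy to reason about. As a rough sketch (not GRID’s actual code — the function name and behavior here are our own invention), the editor’s input handler only needs to distinguish a `//` description from an ordinary cell entry:

```python
def parse_assistant_trigger(cell_input):
    """Return the natural-language description if the input starts with
    the '//' trigger, otherwise None (treat it as a normal cell entry).

    Hypothetical sketch -- the real editor also fires on a pause in
    typing, a debouncing concern omitted here.
    """
    if cell_input.startswith("//"):
        return cell_input[2:].strip()
    return None
```

An entry like `//second highest value in column B` yields the description for the assistant, while `=LARGE(B:B, 2)` passes through untouched as a regular formula.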

Does it really work?

Yes, it works incredibly well! Our primary intention was to help less experienced modelers create relatively simple formulas, something like:

//second highest value in column B
=LARGE(B:B, 2)
//is the sum of A2 to A10 larger than 100
=SUM(A2:A10)>100

But to our surprise, it turned out to be capable of coming up with fairly sophisticated formulas as solutions to problems where it might take more advanced spreadsheet modelers a while to remember the functions and figure them out:

//value in column b where name in column A is Michael
=INDEX(B:B,MATCH("Michael",A:A,0))
//order A2:D100 by column C alphabetically
=SORT(A2:D100, 3, 1)

Even more complex descriptions can return neat solutions:

//if the number of values in column C that are higher than 5 is lower than 10 return "not even 10 students passed" otherwise "more than 10 students passed"
=IF(COUNTIF(C:C,">5")<10,"Not even 10 students passed","More than 10 students passed")

OK, but when does it fail?

One of the challenges here is to make it clear that this is an “assistant”, not an “expert”. When you have an assistant, it is your role to verify their work, and ultimately you are responsible for the result whether you accept, reject or correct the suggestion.

The assistant has only been out there for a couple of days, so we don’t have enough statistics to draw conclusions, but early testing indicates that for descriptions with a fairly straightforward solution, the suggestions are correct about 90% of the time. The more complex the description, the more likely it is that the suggestion will not be correct. However, it is often close enough that minor edits will get it over the finish line, and it often hints at solutions that the modeler might not have thought of otherwise.

As one of our early access testers said:

I’ve already learned a ton of new spreadsheet formulas and ways to build them by testing it for a few days

An interesting thing we’ve noticed is that as soon as people see how “smart” the assistant is, they expect it to understand the spreadsheet in a way that they would never expect of a spreadsheet formula, commonly referring to context in the spreadsheet such as:

//the name of the student that has the highest grade
[results will vary and certainly not be correct]

The assistant doesn’t have context for “student” or “grade”. It will likely come back with a plausible-looking formula, but without knowing which ranges or columns hold the student names or grades, it has no chance of getting it right. It works beautifully, however, if this context is provided:

//the name of the student in column A that has the highest grade in column D
=INDEX(A:A,MATCH(MAX(D:D),D:D,0))

You can even set the context first and then ask the question as before:

//student names are in column A, grades are in column D, what’s the name of the student with the highest grade?
=INDEX(A:A,MATCH(MAX(D:D),D:D,0))

Similarly, people expect the assistant to know the selections in the sheet, trying things like:

//sum of the 5 cells above this one
[results will vary and certainly not be correct]

If you provide the context, however, it will likely do the right thing:

//cell B20 is selected, what’s the sum of the 10 cells above it
=SUM(B10:B19)

Seeing these expectations, we are experimenting with ingesting more context into the natural language queries when prompting the model.
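One straightforward way to do that kind of ingestion — sketched here under our own assumptions, with a function and parameters that are illustrative rather than GRID’s — is to prepend whatever sheet context is known before the description reaches the model:

```python
def add_sheet_context(description, selected_cell=None, column_names=None):
    """Prepend known sheet context to the user's description so the model
    can resolve references like "this one" or "the highest grade".

    column_names: optional dict mapping column letters to what they hold,
    e.g. {"A": "student names", "D": "grades"}. Illustrative only.
    """
    parts = []
    if column_names:
        parts.append(", ".join(
            f"column {col} contains {what}"
            for col, what in column_names.items()))
    if selected_cell:
        parts.append(f"cell {selected_cell} is selected")
    parts.append(description)
    return ", ".join(parts)
```

With `selected_cell="B20"`, the earlier query about “the 5 cells above this one” arrives at the model already carrying the selection, mirroring the rephrasing that already works when users type it in by hand.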

Which brings us to the heart of the matter…

How does this work?

The formula assistant is built on top of OpenAI’s API, using the text-davinci-003 model.

As with many other applications built on OpenAI’s models, the real trick is crafting a prompt that returns good and relatively reliable results. This took quite a lot of experimentation, especially as we weave the formula description written by the user into a larger prompt to get the right kind of results. This happens in pre-processing, where the prompt is built dynamically. There are already some smarts here, but as mentioned above, we see opportunities to do a lot more.
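To make the shape of this concrete, here is a minimal sketch of wrapping a user description in a larger prompt and sending it to the completions endpoint. The template text and function names are invented for illustration — GRID’s actual prompt is not public — and note that text-davinci-003 has since been deprecated by OpenAI.

```python
import json
import urllib.request

# Invented template -- the real prompt is more elaborate.
PROMPT_TEMPLATE = (
    "Convert the description below into a single spreadsheet formula.\n"
    "Reply with the formula only, starting with '='.\n\n"
    "Description: {description}\n"
    "Formula:"
)

def build_prompt(description):
    """Pre-processing step: weave the user's description into the prompt."""
    return PROMPT_TEMPLATE.format(description=description)

def suggest_formula(description, api_key):
    """Call OpenAI's legacy /v1/completions endpoint (network required)."""
    payload = {
        "model": "text-davinci-003",
        "prompt": build_prompt(description),
        "temperature": 0,   # favor deterministic suggestions
        "max_tokens": 128,
        "stop": ["\n"],     # one formula per completion
    }
    req = urllib.request.Request(
        "https://api.openai.com/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"].strip()
```

A `temperature` of 0 and a newline stop sequence are natural choices for this use case: you want one formula, the same one each time, not creative variation.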

When the response comes back from the OpenAI API, it goes through post-processing to clean up and otherwise “sanitize” it. If the response doesn’t include a formula at all, or an error was thrown, we pass a notice back to the user. Otherwise the formula gets sent to an “analyze-and-fix” process that uses our powerful browser-based spreadsheet engine to determine whether the formula is valid and fully supported. This process can rewrite the formula to fix simple syntax errors or otherwise make it valid for use in GRID. Currently we are quite lenient in what we pass through, giving the user the option to correct the formula even if the analyze process determines that it isn’t 100% valid. We’ll likely cut down on that as we further improve the fixes, so that the suggestions are always valid formulas.
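A minimal version of the cleanup step might look like the following — a sketch of the idea only, not GRID’s analyze-and-fix process, which leans on their full spreadsheet engine rather than string heuristics:

```python
def sanitize_suggestion(raw):
    """Clean up a raw model reply: trim whitespace, normalize curly quotes
    to straight ones, strip a leading label such as "Formula:", and return
    None when no formula came back (so a notice can be shown instead)."""
    text = (raw.strip()
               .replace("\u201c", '"')   # left curly double quote
               .replace("\u201d", '"'))  # right curly double quote
    if not text.startswith("=") and ":" in text:
        # Models sometimes echo a label, e.g. "Formula: =SUM(A1:A3)".
        text = text.split(":", 1)[1].strip()
    return text if text.startswith("=") else None
```

Anything that survives this stage would then go to the validity analysis; anything that returns `None` becomes a notice to the user.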

Note that “valid” doesn’t mean they are the correct solutions to the natural language description the user provided. It is still the user’s responsibility to verify.

GPT-3 is an amazing technology to which the formula assistant obviously owes its existence.

However, it is the prompt engineering plus the pre- and post-processing GRID does that takes the assistant to the next level. And we’re just getting started!

Go ahead and give GRID and the formula assistant a spin. We’re eager to hear your feedback.


Founder and CEO of GRID (@grid_hq) — the future of numbers. Proud data nerd. Curious about everything. Founder of 5 software companies.