
Dion Almaer

Software, Development, Products


Archives for February 2023

I have scissors all over my house

February 20, 2023

Midjourney hallucinates scissors and houses

I used to be the type of chap who had one place for everything. The scissors are in That Drawer in the kitchen. It kinda worked, until I had to live with other humans I didn’t have control over.

After years of fighting against the system of “one place for every thing” I went all in on the other direction, I think inspired by reading Algorithms to Live By, and Brad Fitzpatrick’s work with Perkeep. Since then, I have taken the opposite approach to a single source of truth for objects, and instead I put items in many places, ideally where they would be used. This is why I now have scissors all over my house, or screwdrivers, etc. “All over” is going a lil far; I put them in spots where I think they will be used. This is akin to a CDN… caching a copy closer to where someone needs it.

Why am I talking about scissors?

I went through the exact same kind of transition in the virtual world. Where do I keep my data? I would try to centralize it as much as possible. E.g. Google Drive as a source of truth, and then split off for types of content that weren’t a fit (e.g. I used Asana as a central database for a long time, and 1Password for passwords, and Active Recall for learning, and Type.ai for writing, etc).

After years of trying new systems, and migrating data, I took the other approach:

  • I embrace the fact that one system won’t be perfect, for now, and especially not for the future
  • I don’t worry about migrating data as I try new things. I jumped around between Roam, Obsidian, and Logseq, as an example. Once things are a lil more settled, I may then do some migration
  • I favor products where I can get to the data (yay owning your data)
  • I favor products where there are strong integrations. Instead of a central merge, I can then connect all of the things and have the data show up in all of the places

But how about finding things? Integrating into one search to rule them all is vital when you have data in various spots. I’m hopeful for products such as Needl (I need more integrations and an SDK to integrate with). My latest foray here is importing all of my second brain into my own local Polymath, with access control.

Now I can use natural language to get semantic search results, each with links that poke me to where the data is. Often it’s in multiple spots so I can choose what I want to open up to see and use that data.

There is so much opportunity for us to get to a collective, with integrations, that allows us to evolve and connect our data.

GenAI: Lessons working with LLMs

February 14, 2023

Creativity & Constraints, Foundations & Flywheels

The developer community is buzzing around the new world of LLMs. Roadmaps for the year are getting ripped up one month in, and there is a whole lot of tinkering… and I love the smell of tinkering.

At Shopify we shared a new Winter Edition, which packaged up 100+ features for merchants and developers. Some of the launches had a lil Shopify Magic in them, using LLMs to make life better for our users.

I had a lot of fun shipping something for developers that used LLMs, and I thought I would write about a few things that I learned on the way to shipping.

The mock.shop homepage

What did we ship? mock.shop

We want to make it as easy as possible for developers to learn and explore commerce by playing. We wanted to remove as much friction as possible from exploring a commerce data model and building a custom frontend to show off your work.

This is where mock.shop comes in: it sits in front of a Shopify store, but doesn’t require you to create one yourself. Just start playing with it and hitting it directly!
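To give a sense of what “hitting it directly” can look like, here is a minimal fetch sketch; the endpoint URL and the exact field names are assumptions on my part, so check mock.shop itself for the real query shapes.

// Minimal sketch of querying mock.shop with fetch. The https://mock.shop/api
// endpoint and the Storefront-style products/edges/node shape are assumptions.
const query = `
  query {
    products(first: 3) {
      edges {
        node {
          id
          title
        }
      }
    }
  }
`;

const response = await fetch("https://mock.shop/api", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query }),
});

const { data } = await response.json();
console.log(data.products.edges.map(({ node }) => node.title));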

One thing we have heard from some developers is that they are new to GraphQL and/or new to the particulars of the commerce domain. We show examples, along with the GraphQL and code to work with them, but could we go even further?

Gil seeing mock.shop

Generate query with AI

What if you could just use your words and ask us to generate the GraphQL for you? That’s exactly what we did. And here’s what we learned…

Foundations & Flywheels

We used OpenAI for this work, and when working with LLMs you are working with a black box. While GPT-3 had some knowledge of GraphQL, and Shopify, its knowledge was outdated and often wrong. Out of the box you are working with anything that the model has sucked up, and you can’t trust this data at all.

You need to do all you can to feed the black box information so that it can come up with the best results. Given the black box, you will need to experiment and keep poking it to see if you are making it better or worse.

Here are some of the foundational things that we did:

Feed it the best input

Gather all of the information that you think will nudge the model in the right direction. In our case we gathered the GraphQL schema (SDL) for the Shopify storefront APIs, and then a bunch of good examples. With these in hand, we would chunk them up and create OpenAI embeddings from them. You end up with a library of these embeddings, which are vectors that represent the chunks of text.

With these embeddings we can take user queries (e.g. “Get me 7 of the most recent products”), get an embedding from that query, and then look for similar embeddings in the library that we have created. Those will contain snippets such as the schema for the products GraphQL section, and some of the good examples that work with products. We call this context, and we pass it to the OpenAI completions endpoint as part of a prompt.
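Here is a rough sketch of that embed-and-retrieve flow, calling the OpenAI embeddings REST endpoint directly. The model name reflects what was current at the time, and the helper names (embed, findContext) are just for illustration.

// Create an embedding for a piece of text via the OpenAI REST API.
async function embed(text) {
  const response = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: text }),
  });
  const json = await response.json();
  return json.data[0].embedding; // a vector of floats
}

// Compare two embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// library is an array of { text, embedding } built from the schema and example chunks.
async function findContext(library, userQuery, topK = 3) {
  const queryEmbedding = await embed(userQuery);
  return library
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(chunk.embedding, queryEmbedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((chunk) => chunk.text)
    .join("\n\n");
}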

Customize the prompt

You will want to play with prompts that result in the right kind of output for your use case. In our case we are looking for the black box to not just start completing with sentences, but rather give back valid GraphQL.

You end up with a prompt such as:

Answer the question as truthfully as possible using the provided context, and if you don’t have the answer, say “I don’t know”.\nContext:\n${context}\n\nQuestion:\nWhat is a Shopify GraphQL query, formatted with tabs, for: ${query}\n\nAnswer:

You can see how the prompt is:

  • Politely asking for the answer to be truthful
  • Nudging for the answer to be tied to the given context (from the embeddings) vs. making it up from full cloth, and saying that it’s ok to say “I don’t know”!
  • Asking for a formatted GraphQL query

One other way that we try to stop any hallucinating from the model is by setting the temperature to 0 when we make the completion call. The OpenAI docs describe temperature as:

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It’s quite funny to see how we do everything to try to get the model to speak the truth with this type of use case!
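Putting the prompt and the temperature together, the completion call can be sketched like this; the model name and max_tokens value are assumptions rather than exactly what we shipped.

// Build the prompt from the retrieved context and ask for a completion with
// temperature 0. The model and max_tokens here are illustrative choices.
async function generateGraphQL(context, query) {
  const prompt =
    `Answer the question as truthfully as possible using the provided context, ` +
    `and if you don't have the answer, say "I don't know".\n` +
    `Context:\n${context}\n\n` +
    `Question:\nWhat is a Shopify GraphQL query, formatted with tabs, for: ${query}\n\n` +
    `Answer:`;

  const response = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "text-davinci-003",
      prompt,
      temperature: 0, // as deterministic as the model will give us
      max_tokens: 512,
    }),
  });
  const json = await response.json();
  return json.choices[0].text.trim();
}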

Feedback and Flywheels

Now it’s time for the flywheels to kick in. You want to keep feeding the context with high quality examples, sometimes show what NOT to do, play with different prompts, and start getting feedback.

You will see lots of examples where users are asked for feedback. E.g. in support systems and documentation: did this help? is it accurate? To train the model as best as possible, you can look for ways to get this information from the experts (humans!) and feed it on back, as well as simply tracking what your users are asking for and how well you are acting on those needs!

Creativity & Constraints

We have the foundations in place, and the quality of data will improve through the flywheels. Now it’s time to get more constrained. We are doing all we can to nudge for truth, but you can’t trust these things, so what guardrails should you put in place?

We really want the GraphQL that we show to be valid, so… how about we do some validation?

We take the GraphQL that comes back and do a couple of things (a sketch of the validation step follows the list):

  • Tweak it, when possible, to place valid IDs and content for the given dataset that we have in the mock.shop instance
  • Validate the GraphQL to make sure the syntax is correct
  • Run it against mock.shop, since we have real IDs, and show the results to the user!
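Here is a sketch of that validation step using the graphql npm package, assuming you still have the SDL string that was gathered for the embeddings.

// Guardrail sketch with graphql-js: parse the model's output and validate it
// against the schema built from the SDL we gathered earlier.
import { buildSchema, parse, validate } from "graphql";

function checkGeneratedQuery(sdl, generatedQuery) {
  const schema = buildSchema(sdl); // throws if the SDL itself is malformed

  let document;
  try {
    document = parse(generatedQuery); // throws on invalid GraphQL syntax
  } catch (syntaxError) {
    return { ok: false, errors: [syntaxError.message] };
  }

  const errors = validate(schema, document); // checks fields, arguments, types, etc.
  if (errors.length > 0) {
    return { ok: false, errors: errors.map((e) => e.message) };
  }
  return { ok: true, document };
}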

You can’t assume anything, so you will often need a guard step once you get results back.

ChatGPT vs. Stockfish

There was a lot of hubbub when someone pitted ChatGPT against Stockfish in a game of chess. Many used it as a way to laugh at ChatGPT. This thing is crazy! It made all kinds of invalid moves! No doy! You have to assume that and build systems to tame it… a chess engine wouldn’t allow invalid moves.

Defensive

You have to be incredibly defensive. You are poking a brain with electrodes. It comes out with amazing things, but you can’t trust everything that comes back. Remote calls to OpenAI are themselves flaky, and the service often goes down.

Not only will you be checking for timeouts and errors in results, but you should also consider a feature flag toggle. In the case of mock.shop, the tool is usable without any of the AI features. They are progressive enhancements to the product.

We can add checks to automatically turn it off if something really bad is happening with OpenAI. Marry both:

const openAIStatusRequest = fetch("https://status.openai.com/api/v2/status.json");

and check the results for the type of incident:

openAIStatus.status.indicator === "major"
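Married together, that check might be sketched like this; the fail-closed behavior and the function name are just illustrative choices to show how the pieces fit.

// Flip the AI features off when OpenAI's status page reports a major incident,
// or when the status page itself cannot be reached.
async function shouldEnableAIFeatures() {
  try {
    const openAIStatusRequest = await fetch("https://status.openai.com/api/v2/status.json");
    const openAIStatus = await openAIStatusRequest.json();
    return openAIStatus.status.indicator !== "major";
  } catch (error) {
    // If we cannot even reach the status page, fall back to the non-AI experience.
    return false;
  }
}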

It’s incredibly fun getting creative with how you can use the power of LLMs, which are getting better and faster all the time. The black box nature can be frustrating at times, but it’s worth it.

I hope you are having some fun tinkering!


There are so many helpful libraries out there. I have been working with some friends on Polymath (https://polymath.almaer.com/) to make it simple to import and create the libraries, as well as query it all.
