Simon Willison @simon@simonwillison.net

Open source developer building tools to help journalists, archivists, librarians and others analyze, explore and publish their data. datasette.io and many other .

@paul Oh I like this quote a lot:

> Software production has been too complex and expensive for too long, which has caused us to underproduce software for decades, resulting in immense, society-wide technical debt.

@dmitry Yeah for my experiments with ReAct I've stuck with the ChatGPT 3.5 API, just because it's so inexpensive

@dmitry I've actually not hit that limit yet - most of my conversations with it are only two or three messages long, I'm quite good at prompts that return a lot of information for a little bit of input

"Give me 40 ideas for Datasette plugins that use AI" kind of thing

@LDJ what's your opinion on Adobe Firefly, which was trained exclusively on public domain and licensed images? simonwillison.net/2023/Mar/21/

@pjbrunet @securopean I find that once you get good at prompting it the code quality produced by GPT is actually really good - cleanly designed, well commented - and if you don't like something it's done you can often fix it with a tiny follow-up prompt (or edit it yourself)

@paul I'm really excited about the vision for LLM-assisted end-user programming laid out in this post: www.geoffreylitt.com/2023/03/2

@pjbrunet that was my first idea, but I wanted the raw markdown - and translating from HTML in the DOM back to markdown again felt less easy than intercepting the JSON directly

@matt it really doesn't

When I'm using it for "production" code I only ever generate code that I could have written myself if I'd put the effort in

I sometimes see suspicious chunks when I'm asking it more general research questions ("show me how transformer models work with detailed python examples") but I don't copy those into my own codebase

@dmitry mostly 4 these days

I wrote about how AI-enhanced development makes me more ambitious with my projects simonwillison.net/2023/Mar/27/

Here's a useful ChatGPT prompt:

> ... paste in some JavaScript/jQuery code ...
>
> Rewrite this to use vanilla JavaScript, no jQuery

Just used that to port some older customization code to the latest Datasette documentation theme: github.com/simonw/datasette/is

@numist if you pay them you can send prompts to their Discord bot via private message instead

@numist it's very weird! You have to join their Discord and prompt it in one of the public newbies channels

@lewiscowles1986 yeah jq is a pretty terrible tool! I'm using it so much because it's available in the environments I care about and it solves a need for me: incorporate JSON into command-line pipelines

Before ChatGPT I hardly used it at all

But now... my tolerance for weird DSLs is suddenly significantly higher

@lewiscowles1986 yeah, I love it for jq - I didn't used to use that tool at all because I found the syntax too difficult to remember but now I use it several times a week

@lurker it looks to me more like they've expanded that existing feature that shows extracts of the top matching web page, except that's no good for "current time" because it's from their last crawl

@michael oops! Mixed up New York and London in the Apple world clock view... edited my toot now!

Looks like Google broke the "time in London" feature - I just got this saying the answer is 9:39pm, but the correct answer is actually 3:10pm

A quick warning about Midjourney v5: it really can produce very convincing photorealistic output now

So maybe don't prompt it with "chicken made of human teeth --v 5"

It would honestly be unethical for me to share my results with the world, even behind a content warning

@jesse I'd be surprised if there were - my understanding is that each token output involves calculations against all 175B parameters (depending on model size) so storing those intermediary calculations would be TBs of data
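
A rough back-of-envelope version of that estimate, as a sketch only. It assumes (hypothetically) that one 2-byte value is kept per parameter for every generated token, and that a reply is 500 tokens long; neither figure comes from the post, and this is a simplification rather than a description of how any real model stores its state.

```python
# Back-of-envelope estimate; every constant except the 175B figure is an assumption.
PARAMETERS = 175_000_000_000   # 175B, the figure quoted in the post
BYTES_PER_VALUE = 2            # assumed: one 16-bit value kept per parameter
TOKENS = 500                   # assumed length of a single reply


def stored_terabytes(parameters: int, bytes_per_value: int, tokens: int) -> float:
    """Size in TB of keeping one value per parameter for every token."""
    total_bytes = parameters * bytes_per_value * tokens
    return total_bytes / 1e12


per_token_tb = stored_terabytes(PARAMETERS, BYTES_PER_VALUE, 1)
reply_tb = stored_terabytes(PARAMETERS, BYTES_PER_VALUE, TOKENS)
print(f"{per_token_tb:.2f} TB per token, {reply_tb:.0f} TB per reply")
```

Under those assumptions even a handful of tokens lands in terabyte territory, which matches the order of magnitude in the post.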

@therealfitz don't fall into the trap of asking a language model why it made a previous decision - they don't persist any of the calculations that led to their previous messages, so all they can do is hallucinate a convincing-sounding rationale based on re-inspecting the previous text in the conversation

@smy20011 not if you create it as a secret gist - it's available to people who know the URL but it's served with a meta robots tag that blocks it from being indexed
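
A minimal sketch of how one might check a page for that kind of robots meta tag, using only the Python standard library. The class and function names and the sample markup are hypothetical illustrations; the markup is not copied from GitHub, and the exact tag GitHub serves is not shown in the post.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def blocks_indexing(html):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page markup for illustration only.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body></body></html>"
)
print(blocks_indexing(SAMPLE_PAGE))
```

This only inspects the HTML body; a page can also signal the same thing through an `X-Robots-Tag` HTTP header, which this sketch does not look at.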

In case anyone's interested, here's the full transcript of my ChatGPT session that helped me figure this out - it's longer than I thought it would be!

(I share ChatGPT transcripts in private GitHub gists to avoid them being indexed by search engines, trying to avoid adding to AI generated text pollution in the world)

gist.github.com/simonw/c3b486f

@acdha I also decided I wanted something without any extra dependencies if possible (I want to use it in GitHub Actions without setting up pip caching)

@acdha I could remember that tool existed but couldn't remember its name to find it!