Built for šāLinkedInāBlueskyāThreads. Powered by AI
Write & schedule, effortlessly
Craft and publish engaging content in an app built for creators.
NEW
Publish anywhere
Publish on X, LinkedIn, Bluesky, Threads, & Mastodon at the same time.
Make it punchier š
Typefully
@typefully
We're launching a Command Bar today with great commands and features.
AI ideas and rewrites
Get suggestions, tweet ideas, and rewrites powered by AI.
Turn your tweets & threads into a social blog
Give your content new life with our beautiful, sharable pages. Make it go viral on other platforms too.
+14
Followers
Powerful analytics to grow faster
Easily track your engagement analytics to improve your content and grow faster.
Build in public
Share a recent learning with your followers.
Create engagement
Pose a thought-provoking question.
Never run out of ideas
Get prompts and ideas whenever you write - with examples of popular tweets.
@aaditsh
I think this thread hook could be improved.
@frankdilo
On it š„
Share drafts & leave comments
Write with your teammates and get feedback with comments.
NEW
Easlo
@heyeaslo
Reply with "Notion" to get early access to my new template.
Jaga
@kandros5591
Notion š
DM Sent
Create giveaways with Auto-DMs
Send DMs automatically based on engagement with your tweets.
And much more:
Auto-Split Text in Posts
Thread Finisher
Tweet Numbering
Pin Drafts
Connect Multiple Accounts
Automatic Backups
Dark Mode
Keyboard Shortcuts
Creators loveĀ Typefully
180,000+ creators andĀ teams chose Typefully to curate their Twitter presence.
Marc Kƶhlbrugge@marckohlbrugge
Tweeting more with @typefully these days.
š Distraction-free
āļø Write-only Twitter
š§µ Effortless threads
š Actionable metrics
I recommend giving it a shot.
Jurre Houtkamp@jurrehoutkamp
Typefully is fantastic and way too cheap for what you get.
Weāve tried many alternatives at @framer but nothing beats it. If youāre still tweeting from Twitter youāre wasting time.
DHH@dhh
This is my new go-to writing environment for Twitter threads.
They've built something wonderfully simple and distraction free with Typefully š
Santiago@svpino
For 24 months, I tried almost a dozen Twitter scheduling tools.
Then I found @typefully, and I've been using it for seven months straight.
When it comes down to the experience of scheduling and long-form content writing, Typefully is in a league of its own.
After trying literally all the major Twitter scheduling tools, I settled with @typefully.
Killer feature to me is the native image editor ā unique and super useful š
Visual Theory@visualtheory_
Really impressed by the way @typefully has simplified my Twitter writing + scheduling/publishing experience.
Beautiful user experience.
0 friction.
Simplicity is the ultimate sophistication.
Queue your content inĀ seconds
Write, schedule and boost your tweets - withĀ noĀ need forĀ extra apps.
Schedule with one click
Queue your post with a single click - or pick a time manually.
Pick the perfect time
Time each post to perfection with Typefully's performance analytics.
Boost your content
Retweet and plug your posts for automated engagement.
Start creating a content queue.
Write once, publish everywhere
We natively support multiple platforms, so that you can expand your reach easily.
Check the analytics thatĀ matter
Build your audience with insights that makeĀ sense.
Writing prompts & personalized postĀ ideas
Break through writer's block with great ideas and suggestions.
Never run out of ideas
Enjoy daily prompts and ideas to inspire your writing.
Use AI for personalized suggestions
Get inspiration from ideas based on your own past tweets.
Flick through topics
Or skim through curated collections of trending tweets for each topic.
Write, edit, and track tweetsĀ together
Write and publish with your teammates andĀ friends.
Share your drafts
Brainstorm and bounce ideas with your teammates.
NEW
@aaditsh
I think this thread hook could be improved.
@frankdilo
On it š„
Add comments
Get feedback from coworkers before you hit publish.
Read, Write, Publish
Read, WriteRead
Control user access
Decide who can view, edit, or publish your drafts.
Watching @karpathy presentation from today and taking twitter notes, come along for the ride:
If you're like only the practical tips, skip to #32
@karpathy starts with stages:
1 - Pre-training - months x thousands of GPUs
2, 3, 4 - Finetuning stages that take hours or days
Before pre-training happens, there are 2 preparation steps.
Data collection - Get tons of data from different sources (here Andrej LLaMa mixture)
Tokenization - a lossless translations between pieces of words and integers.
"You shouldn't judge the power of the model just by the number of parameters it contains"
LLaMa has trained on 1-1.4 Trillion tokens vs 300B tokens in GPT-3.
"I don't have enough time to go into how transformers work unfortunately" š Gotta love Andrej thirst for teaching!
I cannot summarize this into a tweet tbh.
Here's an example from NYT who trained a GPT model on Shakespeare
You can see continued improved after many iterations of how the LM is getting better at predicting what next word would come in a Shakespeare text.
Ok STRONGLY paraphrasing here but, every iteration, the trainee model tries to predict which token/integer would come next after the green one (in image) and this is outlined by the Training curve, how well does is it able to predict the next tokens compared the original text
Around GPT-2, the industry noticed that if we structure out prompts in a specific way, and provide a few examples (Few Shot prompting) then the base model will be "tricked" into autocompleting what instructions we provided it in prompt.
Andrej repeats this several times, the best open source model to learn from right now is probably LLaMa from @MetaAI (since OAI didn't release anything about GPT-4)
GPT-2 - released + weights
GPT-3 - base model available via API (da-vinci)
GPT-4 - Not Available via API
Base models are not assistants, they don't "do what you ask them" in the basic sense. They just autocomplete text.
But if you structure your document with Few-shot prompts, it will "trick" the base model to think that it autocompletes a chat between an AI and a human
But this trick is not enough. So we're moving to step 2.
Supervised Finetuning.
Collecting small but high quality (think human contractors) datasets of instructions
And continue training the model with a swapped dataset now and we get the SFT (supervised finetuning) model.
SFT model is... not great yet, definitely not chatGPT quality. So the training continues
Generating outputs of questions with the SFT model, users review and compare between 3 versions & rank which was the best, and then the model is retrained on the selections by the users
This is done by wighting the better voted on responses. For example, when you hit š or š in chatGPT, or choose to regenerate a response, those signals are great for RLHF.
Andrej is going into the potential reasons of why RLHF models "feel" better to us. At least in terms being a good assistant.
Here again if anyone's still reading, I'll refer you to the video š
Interestingly, Andrej talks about RLHF are not strictly improvements on base models. RLHF models have less enthropy so it is less "inventive" potentially.
For that base models are still better because they are still chaotic.
This is the current state of models as ranked by folks from Berkley based on ranking.
Interestingly here, @karpathy here says that GPT-4 is the best "by far", but on the chart its 1274 to Claude's 1224 ELO rating that doesn't seem "by far"
Imsys.org/blog/2023-05-10-leaderboard/
RLHF models are better ranked, all the top 3 are RLHF models and the rest (to his knowledge are SFT models)
Wohoo! We're through the first half of the talk. Moving to Application of these models to problems.
Andrej then goes fairly in depth into the difference between a human being process of writing a statement like
"California's population is 53 times that of Alaska"
A human brain goes through loops, fact checks, calculation, reflection.
While a GPT is trying to autocomplete, there is no internal dialog in GPT.
It spends the same amount of "compute" per token, no matter if the token is a number it needs to look up or a fact it needs to check, but they have vast knowledge and perfect memory (context window)
Methods like Chain of thought provide models with "more tokens" or "more time to think" by asking "let's think step by step"
Which will make the model to show it's work, and this will give it "time to think" for a better answer
Now Andrej is going into Self Reflection as a method.
Models can get "stuck" because they have no way to cancel what tokens they already sampled.
Imagine yourself saying the wrong word and stopping yourself in the middle "let me rephrase" and you re-start the sentence
Models don't have that luxury so they can get stuck down that wrong path...
But examples like self-reflection show that asking the model to review it's output, judge it, gives models a "second change" or another pass over the reasoning of the output which improves results!
I love it, Andrej uses the Thinking Fast and Slow - system 1 and system 2 models of our thinking to LLMs.
These techniques like CoT, Self Reflexion and the recently released Tree of thought are our attempt to build system 2, the slower, more deliberate thinking
š analogy.
Andrej also calls out #AutoGPT ( by @SigGravitas ) as a project that got overhyped but is still very interesting to observe and get inspiration from
I'll plug in my twitter list of "Agent" builders that includes many of those folks
twitter.com/i/lists/1642934512836575232
But Andrej doesn't think this currently works very well for production. But folks should "watch this space"
Moving on:
"LLM's don't WANT to succeed, a human wants to"
Transformers work better when asked to work better.
Ok this next slide, I made almost verbatim the same one in my presentation 3 days ago! Haha, impostor syndrome begone
Watch the plugin space, as providing the models with plugins/tools like calculator, code interpreter, search etc'
Remember, bing is coming to chatGPT!
"Context window of the transformer is it's working memory"
The model has immediate perfect access to it's working memory.
Andrej calling out @gpt_index by @jerryjliu0 on stage as an example of a way to "load" information into this perfect recall working memory.
Yay, he's covering the guidance project from Microsoft, that constraints the prompt outputs.
github.com/microsoft/guidance/
On to Finetuning - Prompt engineering can only take you so far. (could be really far)
Fine-tuning changes the weights of the models. Works for smaller and open source models.
With methods like LoRa allow you to only train small pieces of the large model which reduces costs
This is way more efficient than retraining the whole model, and is more available.
Andrej again calls out LlaMa as the best open source fine-tuneable model and is hinting at @ylecun to open source it for commercial use š š
If you'd like @karpathy's practical examples - start from here š
This is the de-facto "kitchen sink" for building a product with an LLM goal/task.
"Use GPT-4 he says, it's by far the best."
I personally noticed Claude being very good at certain tasks, and it's way faster for comparable tasks so, y'know if you have access, I say evaluate. But he's not wrong, GPT-4 is... basically amazing.
Can't wait for Vision š
twitter.com/altryne/status/1658357541846552582
"What would you tell a task contractor if they can't email you back" is a good yard stick at a complex prompt by Andrej.
From me: for example, see wolfram alpha's prompt
twitter.com/dmvaldman/status/1658689854056853504
Retrieve and add any relevant context or information to the prompt.
And shove as many examples of how you expect the results to look like in the prompting.
Me: Here tools like @LangChainAI and @trychroma come into play, use them to enrich your prompts.
Experiment with tools/plugins to offload tasks like calculation, code execution.
Andrej also suggest first achieving your task, and only then optimize for cost.
Me: removing the constraint of "but this will be very costly" definitely helps with prompting if you can afford
If you've maxed out prompting, and he again repeats, prompting can take you very far, then your company can decide to move to fine-tuning and RLHF on your own data.
Optimizing for costs :
- Use lower quality models if they execute on your specific tasks
- Gradually reduce number of tokens of your prompt while testing the output to reduce costs.
Models may be biased
Models may fabricate ("hallucinate") information
Models may have reasoning errors
Models may struggle in classes of applications, e.g. spelling related tasks
Models have knowledge cutoffs (e.g. September 2021)
Models are susceptible to prompt injection
So for may 2023 - per Andrej Karpahy, use LLMs for these tasks:
ā Use in low-stakes applications, combine with human oversight
ā Source of inspiration, suggestions
ā Copilots over autonomous agents
Finally, Andrej concludes with an example of how easy it is to ask for a completion, and with GPT-4 generated address to the audience of #microsoftBuild which he reads in a very TED like cadence to the applauds from the audience!
Thanks @karpathy
And yeah, I made this thread as I was watching, if you like these or my bling reactions, y'know... follow me @altryne :) and @karpathy of course duh.