OpenAI's Developer Day for Non-Developers
You've likely heard that OpenAI announced a lot of things at its developer conference. But what does it all mean if you're not a developer? Here are my key takeaways.
Bigger, Better, Faster & Cheaper - The main GPT models (3.5 and 4) got big upgrades: they are faster and cheaper to use. Additionally, you can now send about 300 pages of text in a single prompt and get back a response of up to about 12 pages. That means GPT won't write a book for you, but it could theoretically write the book summary in one go.
Assistants Framework - OpenAI addressed some of the challenges of building conversational assistants that are as good as ChatGPT and can also draw on your own information in their responses. You can upload Word docs, PDFs, etc., and it builds a virtual knowledge base for you and maintains the context of your conversation. There is no interface for this yet, but companies like Xaqt are addressing it quickly.
Code Interpreter - This has been a part of ChatGPT premium for a while now and is one of my favorite tools ever🔥 While it can generate and execute Python, to me it's like having a junior data analyst on demand. It can analyze spreadsheets and other data sets using natural language prompts. We're looking at how to integrate this into our analytics suite now.
More Geeky Goodies - You've likely heard developers talking about "JSON Mode", which simply means that GPT outputs can be consistently interpreted by other software and pushed out to databases much more easily. We've all seen ChatGPT responses that open with "Sure, I'm happy to help you with that...", which tends not to play nicely with other applications. That should no longer be an issue.
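For the curious, here is a minimal sketch in Python of why that friendly preamble causes trouble. The two reply strings are made up by hand for illustration (they are not real API responses), and `try_parse` is a hypothetical helper name. No OpenAI call is made here; in OpenAI's chat API, JSON mode is requested with the `response_format={"type": "json_object"}` setting.

```python
import json

# Hypothetical model replies, written by hand for illustration only.
chatty_reply = (
    "Sure, I'm happy to help you with that! "
    'Here is the data: {"customer": "Acme", "sentiment": "positive"}'
)
json_mode_reply = '{"customer": "Acme", "sentiment": "positive"}'


def try_parse(reply: str):
    """Return the parsed data, or None if the reply is not valid JSON."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return None


# The chatty reply fails to parse because of the text around the JSON.
chatty_result = try_parse(chatty_reply)

# The JSON-only reply parses cleanly and can be handed to other software.
json_result = try_parse(json_mode_reply)

print(chatty_result)
print(json_result["sentiment"])
```

The point is simply that software on the receiving end expects nothing but structured data, so a reply that is pure JSON can go straight into a database or another application without cleanup.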
Visual Input & Output - Similar to what was rolled out in ChatGPT, images can now be sent to the API, enabling use cases like caption generation and visual analysis. I'm pretty hyped for this in the CX industry, as it can help with remote diagnostics or even extracting text from images. DALL·E 3 can now generate images via the API as well.
AI Voices - OpenAI entered the text-to-speech market by announcing six new AI-generated voices, and they're pretty cool. You'll likely be hearing them integrated into more places.
Speech-to-Text - While light on details, they did announce an update to their Whisper speech-to-text model. The existing API has challenges with price, speed, and limits on the size of the audio file it can support, and it lacks real-time streaming and speaker diarization. I'm hoping that some or all of these are addressed.
With these advancements, AI is poised to become more pervasive across various platforms and applications.
The focus on developers indicates a strategic move by OpenAI towards integrating its technology into a wider range of products, signaling that its offerings extend beyond ChatGPT, which appears to serve as a testing ground for new ideas and features.