Inference Tokens in Wholesale

———-


———-
Today’s top AI Highlights:

1. **Llama 3.1 inference 50-90% cheaper with LLM Inference wholesaler**

2. **You don’t need a Supercomputer to train AI models**

3. **Apple Intelligence is out in public beta**

4. **Build multi-agent AI applications with no-code**

5. **Gradio AI Agent to build, deploy and optimize Gradio apps without a single line of code**

& so much more!

_**Read time: 3 mins**_

———-

———-
## **AI Tutorials**

———-

———-
Gemini with Gmail looks great! But is it worth $20 a month?

In just 30 lines of Python, you can build an AI assistant that connects with your Gmail inbox, retrieves email content, and answers questions about your emails using RAG.

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. [If you’re serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.](https://www.theunwindai.com/subscribe)

🎁 **Bonus worth $50** 💵
Share this newsletter on your social channels and tag Unwind AI ([X](https://x.com/_unwind_ai), [LinkedIn](https://www.linkedin.com/company/unwind-ai?utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind), [Threads](https://www.threads.net/@unwind_ai), [Facebook](https://www.facebook.com/profile.php?id=61561355694033&utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind)) to get an AI resource pack worth $50 for FREE. Valid for a limited time only!

Build RAG App to Chat with your Gmail Inbox: (https://www.theunwindai.com/p/build-rag-app-to-chat-with-your-gmail-inbox)

Subscribe now for FREE – To access future LLM, RAG & AI Agent tutorials (https://www.theunwindai.com/subscribe)

———-

———-
## **Latest Developments**

———-

———-
### [**“LLM inference is a new form of commodity”**](https://www.inference.net/) 💰

A new service called [inference.net](https://inference.net) offers a cost-effective way to access LLM inference. Calling itself “a wholesaler for LLM inference tokens”, it provides access to models like Llama 3.1 (diffusion models coming soon) via both batch and streaming APIs. It claims prices 50-90% lower than established providers like [Together.ai](https://Together.ai) and Groq.

**Key Highlights:**

1. **How they do it -** Data centers have underutilized capacity that most orchestration software cannot use. inference.net uses custom scheduling software to capture small, unused time slots across data centers, turning these previously unsellable compute fragments into sellable AI inference time.

2. **Cost efficient -** Provides up to 100 billion tokens per day for LLM inference at a 50-90% discount compared to other providers.

3. **Fast inference -** Offers 100 tokens per second throughput, ensuring fast and scalable performance.

4. **Uptime -** Maintains 99.9% uptime across multiple data centers, primarily located in North America and Europe.

5. **Apply for API -** Fill out the form [here](https://forms.gle/2eKNTsyeamRbb9XBA) to receive an API key. They also have a grant program for researchers working on projects that require large amounts of batch inference, where jobs can be queued and executed asynchronously in the background.
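
To get a feel for what the claimed numbers mean at volume, here is a rough back-of-the-envelope sketch in Python. Only the 50-90% discount range and the 100 tokens per second figure come from the claims above. The baseline price and the workload size are made-up placeholders, not quoted rates from any provider, and treating 100 tokens per second as a per-stream rate is an assumption.

```python
def discounted_cost(tokens: int, baseline_price_per_million: float, discount: float) -> float:
    """Dollar cost of `tokens` tokens after applying a fractional discount (0.5 = 50% off)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    baseline_cost = tokens / 1_000_000 * baseline_price_per_million
    return baseline_cost * (1.0 - discount)


def generation_seconds(tokens: int, tokens_per_second: float = 100.0) -> float:
    """Time to generate `tokens` tokens on one stream at a fixed rate (assumed per-stream)."""
    return tokens / tokens_per_second


# Hypothetical workload and baseline price (assumptions, not real quotes).
TOKENS = 1_000_000_000   # 1 billion tokens
BASELINE_PRICE = 0.90    # dollars per million tokens, placeholder value

for discount in (0.0, 0.5, 0.9):
    cost = discounted_cost(TOKENS, BASELINE_PRICE, discount)
    print(f"discount {discount:.0%}: ${cost:,.2f}")

hours = generation_seconds(1_000_000) / 3600
print(f"1M tokens on a single stream: about {hours:.1f} hours")
```

Swap in the real per-million-token price of your current provider to see what the low and high ends of the claimed range would save you.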

———-

———-
### [**Train AI Models on Consumer-Grade Hardware and Internet**](https://github.com/NousResearch/DisTrO/blob/main/A_Preliminary_Report_on_DisTrO.pdf) 🧑‍💻

View image: (https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/8f4770f2-4363-4f28-8ace-0b146417c994/image.png?t=1727056374)

If training AI models feels out of reach due to bandwidth and expensive hardware, here’s the solution! Nous Research has released a report on their **DisTrO** family of distributed optimizers that could dramatically reduce the bandwidth needed to train LLMs and diffusion models across multiple GPUs, even over slower internet connections.

DisTrO cuts bandwidth requirements for multi-GPU training by hundreds of times, meaning you can pre-train and fine-tune large models using standard internet connections on consumer-grade hardware.

**Key Highlights:**

1. **Bandwidth efficiency -** DisTrO reduces inter-GPU communication requirements by up to 857x during pre-training, without compromising training efficiency.

2. **Training flexibility -** Enables large-scale model training over consumer-grade internet connections, bypassing the need for high-speed interconnects between GPUs.

3. **Compatibility -** DisTrO is network- and architecture-agnostic, allowing it to function seamlessly across various neural network setups without additional infrastructure costs.

4. **Getting started -** DisTrO’s code isn’t available yet, but Nous Research plans to release it soon, so keep an eye out for when you can start testing it in your own workflows.
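
To see why an 857x reduction matters on a home connection, here is a small arithmetic sketch. It is not DisTrO's method, just a unit conversion. The 857x figure is the "up to" number reported above. The per-step traffic and the link speed are hypothetical values chosen for illustration.

```python
def transfer_seconds(payload_gb: float, link_mbps: float) -> float:
    """Seconds needed to send `payload_gb` gigabytes over a `link_mbps` megabit/s link."""
    payload_megabits = payload_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return payload_megabits / link_mbps


REDUCTION = 857              # "up to" reduction reported for pre-training
BASELINE_GB_PER_STEP = 2.4   # hypothetical traffic per GPU per training step
LINK_MBPS = 100              # hypothetical consumer internet connection

baseline = transfer_seconds(BASELINE_GB_PER_STEP, LINK_MBPS)
reduced = transfer_seconds(BASELINE_GB_PER_STEP / REDUCTION, LINK_MBPS)

print(f"without reduction: {baseline:.0f} s of communication per step")
print(f"with {REDUCTION}x reduction: {reduced:.2f} s of communication per step")
```

With these placeholder numbers, communication drops from minutes per step to a fraction of a second, which is the difference between needing a datacenter interconnect and getting by on ordinary broadband.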

———-

———-
## **Quick Bites**

———-

———-
[**GitHub has made Copilot Autofix for CodeQL, their code analysis tool, free for all public repositories.**](https://github.blog/changelog/2024-09-18-now-available-for-free-on-all-public-repositories-copilot-autofix-for-codeql-code-scanning-alerts/) Copilot Autofix suggests fixes for vulnerabilities found by CodeQL, both on pull requests and for historical alerts that already exist in a codebase. You can review each suggestion and choose whether to accept it.

Former Apple designer [**Jony Ive is collaborating with OpenAI’s CEO Sam Altman on a new AI hardware project**](https://www.theverge.com/2024/9/21/24250867/jony-ive-confirms-collaboration-openai-hardware). While details are scarce, the startup is reportedly looking to raise up to $1 billion and is exploring how generative AI can power a new computing device.

[**Apple has released public betas for iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1, featuring new Apple Intelligence tools**](https://www.theverge.com/2024/9/19/24249206/apple-intelligence-ios-18-1-public-beta) like text rewriting, a redesigned Siri, and photo object removal. These betas are available to users with iPhone 15 Pro, iPhone 16, M1 iPads, and newer Macs via the beta software program.

View image: (https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/dc855ca4-400b-48cc-8a6f-b1a9169ed38f/image.png?t=1727062155)

———-

———-
## **Tools of the Trade**

———-

———-
1. [**RAGApp**](https://github.com/ragapp/ragapp): Build multi-agent AI applications without a single line of code. You can create and customize multiple AI agents with specific roles, prompts, and tools, then deploy them in a chat interface with streaming responses and source attribution.

2. **[Gradio AI Agent](https://app.codegpt.co/es/marketplace/agents/gradio)**: An AI agent assistant that can create, deploy, and optimize entire Gradio applications in Python from a simple text prompt. It uses Claude 3.5 Sonnet to build and optimize Gradio apps.

3. **[Founder Mode Analyzer](https://thewebsiteroast.com/foundermode)**: This fun AI tool analyzes your X (Twitter) profile in seconds and shows if you’re in [Founder Mode](https://paulgraham.com/foundermode.html) or Manager Mode. It assesses your tweets and interactions and helps identify your current focus and approach to business.

4. _**[Awesome LLM Apps](https://github.com/Shubhamsaboo/awesome-llm-apps)**_: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

View image: (https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/httpssubstack-post-media.s3.amazonaws.compublicimagesd0c30d79-b1cb-4941-a7d5-9646aad75c26_1736x818.png)

———-

———-
## **Hot Takes**

———-

———-
1. I really hate the argument about whether LLMs can reason or not. Can anyone mathematically differentiate between inference and reasoning? 🙂 People treat reasoning like it’s something magical, but I bet many who argue about this issue can’t define it, relying more on gut feelings than facts. ~
**_[Chanwoo Park](https://x.com/chanwoopark20/status/1837695027436195996)_**

2. An underutilized perspective on AI for non-technical people is that we now have the world’s most advanced compression system for knowledge

Anyone can download, for free, a 235GB file that can answer questions based on a vast swath of all human writing (even if makes some errors) ~
**_[Ethan Mollick](https://x.com/emollick/status/1837543220835696761)_**

———-

———-
## **Meme of the Day**

———-

———-
Shit

View image: (https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/955ad51e-f11a-44f4-8607-90c09cde085d/image.png?t=1727033044)
Follow image link: (https://x.com/tunguz/status/1837522261244629187)
Caption: Source

———-

———-
That’s all for today! See you tomorrow with more such AI-filled content.

**Unwind AI** – _**[Twitter](https://twitter.com/_unwind_ai?utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind)**_ | _**[LinkedIn](https://www.linkedin.com/company/unwind-ai?utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind)**_ | _**[Threads](https://www.threads.net/@unwind_ai)**_ | _**[Facebook](https://www.facebook.com/profile.php?id=61561355694033&utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind)**_

_**[Awesome LLM Apps](https://github.com/Shubhamsaboo/awesome-llm-apps?utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind)**_ | _**[Sponsor Us](https://sponsorunwindai.com/?utm_source=www.theunwindai.com&utm_medium=referral&utm_campaign=last-week-in-ai-a-weekly-unwind)**_

**PS:** We curate this AI newsletter every day for FREE, and your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉

———-


You are reading a plain text version of this post. For the best experience, copy and paste this link in your browser to view the post online:
https://www.theunwindai.com/p/inference-tokens-in-wholesale
