I’m experimenting with Llama 3 from Meta. There’s a modified build called llama3-gradient:8b-instruct-1048k-q6_K with a context window of 1M tokens (on the order of a few megabytes of plain text), and there are variants with even larger windows. I fed it the entire book about Elon Musk (highly recommend it, by the way!) and it produced a pretty good summary, and did it quickly: the whole output shown in the screenshot was generated in about 40–60 seconds. And this is still the relatively weak 8B model; Meta also has a 70B version. The main point, though, is that all of this runs locally on a laptop. No need to pay for an API, it’s reasonably fast, and the script is small enough to fit on one screen.
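For a sense of what that “one-screen script” looks like, here is a minimal sketch against Ollama’s local REST API (this is not my exact script; the file name, prompt wording, and num_ctx value are illustrative, and the model must already be pulled with `ollama pull llama3-gradient:8b-instruct-1048k-q6_K`):

```python
import requests

MODEL = "llama3-gradient:8b-instruct-1048k-q6_K"

# Illustrative file name: a plain-text dump of the book to summarize.
with open("musk_book.txt", encoding="utf-8") as f:
    book = f.read()

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": MODEL,
        "prompt": f"Summarize the following book in a few paragraphs:\n\n{book}",
        "stream": False,
        # Ollama's default context window is small; raise it so the whole
        # book fits. Up to ~1M tokens for this model, lower it if RAM is tight.
        "options": {"num_ctx": 1048576},
    },
    timeout=3600,
)
print(response.json()["response"])
```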
Still, there are some rough edges. For example, on direct questions about the text (questions whose answers I know for certain), the model doesn’t always answer reliably. With a significantly shorter input it handles the same questions fine.

