
Run a ChatGPT-like AI on Your Laptop Using LLaMA and Alpaca



Popular conversational models like ChatGPT, Bing, and Bard all run in the cloud, in huge datacenters. However, thanks to new language models, it is possible to run a ChatGPT or Bard alternative on your laptop. No supercomputer needed. No huge GPU needed. Just your laptop! Is it any good? Let’s find out.

PHNX the super-slim smartphone cases:
This is an affiliate link.

llama.cpp: https://github.com/ggerganov/llama.cpp
llama model download: https://github.com/shawwn/llama-dl
alpaca.cpp: https://github.com/antimatter15/alpaca.cpp
alpaca model download: https://github.com/tloen/alpaca-lora
Stanford Alpaca: https://github.com/tatsu-lab/stanford_alpaca

Twitter: https://twitter.com/garyexplains
Instagram: https://www.instagram.com/garyexplains/

#garyexplains


41 Comments

  1. I totally agree with your explanation that not everything should be stored in the cloud. I somewhat believe in ownership of stuff, including computers and software. A cloud service can be turned off tomorrow; your PC cannot.

  2. Explaining the 4-bit quantization: it's not reducing the resolution of the image, it's turning a perfectly fine 24-bit color cat into a 4-bit cat, essentially EGA graphics (16 colors, nothing else).
    But apparently this works just fine for AIs.
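The commenter's analogy can be made concrete with a short sketch of block-wise 4-bit quantization. This is a simplified illustration of the general idea, not llama.cpp's actual GGML q4 on-disk format: the block size and the symmetric mapping onto codes 0–15 around a per-block scale are illustrative assumptions.

```python
def quantize_q4(weights, block_size=32):
    """Map each block of floats to 4-bit codes (0..15) plus one float scale per block."""
    blocks = []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        amax = max(abs(w) for w in block)       # largest magnitude in the block
        scale = amax / 7 if amax else 1.0       # map roughly [-amax, amax] onto [-7, 7]
        # shift by 8 so codes fit in an unsigned 4-bit range, then clamp to 0..15
        q = [max(0, min(15, round(w / scale) + 8)) for w in block]
        blocks.append((scale, q))
    return blocks

def dequantize_q4(blocks):
    """Reverse the mapping: each 4-bit code becomes scale * (code - 8)."""
    out = []
    for scale, q in blocks:
        out.extend(scale * (c - 8) for c in q)
    return out

weights = [0.12, -0.53, 0.31, 0.02, -0.98, 0.77, -0.05, 0.44]
restored = dequantize_q4(quantize_q4(weights, block_size=8))
```

Each restored value lands within one quantization step of the original, which is why a 7B model squeezed from 16-bit floats to roughly a quarter of the size still produces sensible text.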

  3. One thing to note about the extra text that gets generated is that it's from another conversation that your computer is having with itself based on your parameters. You can actually see a clear example of this when you run Tavern AI locally, through a terminal. What it does is, every time you submit text, it breaks the text down to weights. This is what is used to decide the subject matter and how the AI responds based on predefined characteristics. It then has a series of conversations along those lines. Then it decides on a response to give. From there, the reply it posts is generated based on specific characteristics of the character. This is an answer refinement stage, which is put into action as the response is generated.

  4. Now I feel like it's "Blade Runner" (1982) and I'm working for the Tyrell Corporation 😉 Just started the alpaca chat on my Mac mini M1 with 8GB RAM, having downloaded the trained model ggml-alpaca-7b-q4.bin from another site. Thank you for this tutorial, Gary!

  5. If your friend has come over at 5am, it's obvious you are wide awake, have let him in through the door, and are standing in your kitchen at the fridge. That is the point at which you ask the question of what to open first. The answers about opening your eyes or the door first are INCORRECT or IRRELEVANT to the context, because the context is clearly that you are deciding what food to make, and thus which items to open and serve first!

  6. Can you help me? I'm installing this with w64devkit on Windows with a 4GB model, and I get totally random answers. Do you know what might be going wrong with my version?

  7. The question about your friend coming for breakfast isn't a good one to test. The prompt already tells the model it's your friend at the door who has come for breakfast, so it infers you've already opened your eyes and the door.

  8. Yeah, absolutely right. I always expect things to run locally, or at least as locally as possible, so that they are reliable. Creativity and inspiration are not always available, so I really need a hand when I'm in need of them.

  9. 7:57 Small correction: most screens today use 8-bit color, and some advanced LED screens use 10-bit color. 32-bit color can't be reproduced on any commercial screen that I know of.

  10. If an EMP goes off in major cities in the coming years and takes down the big servers, it's exactly these home systems with AIs running on them that will keep civilisation at an advanced level. The sooner we can train our own AIs to learn and to program, the better.

  11. Repository unavailable due to DMCA takedown.
    This repository is currently disabled due to a DMCA takedown notice. We have disabled public access to the repository. The notice has been publicly posted.

    If you are the repository owner, and you believe that your repository was disabled as a result of mistake or misidentification, you have the right to file a counter notice and have the repository reinstated. Our help articles provide more details on our DMCA takedown policy and how to file a counter notice. If you have any questions about the process or the risks in filing a counter notice, we suggest that you consult with a lawyer.

  12. I have heard that they have banned ChatGPT-4 in Italy, but I think they are overreacting! It seems to me that ChatGPT has a long way to go before it will be anything more than an amusing toy. The weakness of ChatGPT, insofar as I have experimented with it, is that it can only give answers based on the information that has been fed to it. Quite an interesting example is to ask it to calculate the weight, trajectory, and fuel required to go to the moon. Calculations suggest a moonshot is impossible and that nobody has ever been to the moon. But the received information contradicts this, and ChatGPT will tell you that men did in fact go to the moon. You can give ChatGPT a headache the way Captain Kirk did in the Star Trek episode "The Changeling". In that episode, an alien probe called "Nomad" had acquired a dangerous level of power. Kirk convinces it to self-destruct, uttering the famous line, "Nomad, you are wrong! You are a mistake!" Thanks for uploading.