
Local AI Just Got Easy (and Cheap)



This Google TPU makes local AI simple…

Full Blog Tutorial:

Product Links (some are affiliate links)
– Coral USB TPU 👉 https://amzn.to/3vUkUJH
– Zima Board 👉 https://amzn.to/42dYHCr
– Raspberry Pi 5 👉 https://amzn.to/3HBSBSI
– Coral PCIe TPU 👉 https://amzn.to/3vLKfVW
– M.2 Adapter 👉 https://amzn.to/3OnchgZ

Explore the Coral AI Mini PCIe accelerator by Google, a game-changer for home automation and DIY projects, in my latest video, where I integrate this chip with the ZimaBoard and ZimaBlade for superior performance and cost-effectiveness. Discover how this setup outperforms others, like the Raspberry Pi 5, in speed and thermal efficiency, and follow my journey from troubleshooting software issues to successfully running Frigate, an advanced home-lab computer vision system. Learn how this affordable, under-$100 setup can revolutionize your home tech projects!

Monitor your security cameras with locally processed AI
Frigate is an open source NVR built around real-time AI object detection. All processing is performed locally on your own hardware, and your camera feeds never leave your home.
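
If you're curious what Frigate is asking the Coral to do on every frame, here's a minimal sketch of running an Edge TPU-compiled TFLite detection model through tflite_runtime. The model path and the dummy frame are placeholders, and Frigate's real pipeline (decoding, motion detection, tracking) is much more involved.

```python
# Minimal sketch: push one frame through an Edge TPU-compiled TFLite model.
# Assumes libedgetpu and tflite_runtime are installed; the model path below
# is a placeholder, not a file shipped with Frigate.
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

interpreter = Interpreter(
    model_path="ssd_mobilenet_edgetpu.tflite",              # hypothetical path
    experimental_delegates=[load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
h, w = inp["shape"][1], inp["shape"][2]

# Stand-in for a decoded camera frame (Frigate feeds real RTSP frames here).
frame = np.zeros((1, h, w, 3), dtype=np.uint8)

interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()

# SSD-style detection models typically expose boxes, classes, and scores.
outputs = [interpreter.get_tensor(o["index"]) for o in interpreter.get_output_details()]
print([o.shape for o in outputs])
```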

https://coral.ai/products/m2-accelerator-ae
https://frigate.video/
https://mqtt.org/
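
Frigate can publish its detection events to an MQTT broker, which is why mqtt.org is linked above. As a rough sketch, assuming the paho-mqtt client, a local Mosquitto broker, and Frigate's default frigate/events topic (all assumptions about your setup), listening for events could look like this:

```python
# Rough sketch: watch Frigate's MQTT event stream with paho-mqtt.
# Broker address, topic, and payload field names are assumptions based on
# Frigate's defaults; adjust them to match your own install.
import json
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    client.subscribe("frigate/events")

def on_message(client, userdata, msg):
    event = json.loads(msg.payload)
    # Frigate events carry "before"/"after" snapshots of the tracked object.
    after = event.get("after", {})
    print(after.get("camera"), after.get("label"), after.get("top_score"))

client = mqtt.Client()               # paho-mqtt 1.x-style constructor
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)    # assumes a broker on the same box
client.loop_forever()
```

From there you could trigger notifications or automations off specific labels without any camera data leaving your network.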


42 Comments

  1. Nice work, binge-watching your stuff! What's the screen recording software you're using, with the rounded-border face cam in the bottom right?

  2. Suddenly, someone needed to sell a bunch of slow, 3-year-old inference chips with very limited use. This is their story.

    An RTX 4090 is more energy-efficient than an equivalent number of these things, assuming no sparsity, at 660 INT8 TOPS for ~380W (and that's ignoring the dual Epyc or PCIe switch cards you'd need to run 160 of them in the same machine and still come in below the 4090's base-clock INT8 performance). I suppose you could attempt to run that many off USB somehow on a 16-lane consumer processor, but it won't be able to cope with the interrupt management and USB overhead, since USB is a garbage protocol. That number actually understates the 4090, since it will be running quite a bit faster than base clock at 380W draw, but it doesn't matter: 0.576 W/TOP vs 0.8125 W/TOP. Sparsity potentially halves the 4090's number. And most models that can be quantized to INT8 will run in INT4 as well, so you could be running everything at ~1.3 INT4 PetaOPS.

    Asus makes the only PCIe cards that hold 16 of them per card, and you shouldn't expect any of their products to work correctly. They're either going to require a 1x1x1x1… sixteen-way bifurcation that nothing supports, or Asus is going to charge you $500-600 for a two-generation-outdated PCIe switch that they've probably programmed wrong, if their redrivers on Threadripper boards are any indication. The fact that they need a 2-slot card to dissipate 54W on these is a good indication too. Their new boards with USB4 state that the cables shouldn't be removed or swapped until all power to the motherboard is killed. You know, USB-C connectors, those famously stable things. Their last Threadripper board managed to have slots that stopped working if you used other slots, on a platform with 128 lanes available, and lacked enough 8-pin PCIe power to the motherboard to run more than 2 GPUs… and it still sounded like less of a disaster than their old X99 boards.

    TL;DR: if you're going to buy outdated tech junk, instead of feeding money to whatever goofdouche here is promoting, hop on eBay and treat your home network to an upgrade: P2P 56Gb Ethernet / InfiniBand dual-port cards and some SR4 transceivers, so you can quit being insulted by things like 2.5Gb Ethernet in brand-new machines. Or grab any of the FPGAs out there, which I'm just gonna go ahead and bet will destroy the performance of these things (plus people might actually hire you for "Languages known: VHDL / Verilog" on a resume; they won't for "I SET UP PYTHON AND DETECTED FACES!!!!!!11").
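
For anyone sanity-checking the W/TOP comparison in the comment above, here is the arithmetic using only the commenter's own figures (660 INT8 TOPS at ~380W for the 4090, 4 TOPS per Edge TPU, and the quoted 0.8125 W/TOP for the Corals):

```python
# Quick check of the W/TOP comparison using only the figures quoted above.
rtx_tops, rtx_watts = 660, 380          # commenter's no-sparsity INT8 figures
coral_tops = 4                          # nominal TOPS per Edge TPU module
coral_w_per_top = 0.8125                # figure quoted in the comment

print(rtx_watts / rtx_tops)             # ~0.576 W/TOP for the 4090
print(rtx_tops / coral_tops)            # ~165 modules to match 660 TOPS
print(coral_w_per_top * coral_tops)     # ~3.25 W per module implied
print(coral_w_per_top * rtx_tops)       # ~536 W for a Coral fleet at 660 TOPS
```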

  3. I'm curious if it's possible to use the PCIe version via USB with a simple adapter. I understand that it might affect speeds, but could it at least hypothetically function? I'm trying to research this ATM. Also, the USB version is about $30 more than the PCIe version, so… ya know. Pinching pennies where I can and all that.

  4. Person detection. Training the AI to hunt us more efficiently. lol
    This is an amazing little thing. I do have an issue with AI that runs online. Having a local device that can be used to monitor and do what it has to do on an offline local network is amazing.

  5. I'm curious whether the TensorFlow models are updatable. I.e., could an enterprising person add a model which detects an approaching person (vs. departing)… same with a vehicle.

  6. The USB one isn't only slower because of the interface, it's also underpowered. The one you got isn't the greatest either; you should have gotten the Dual Edge TPU, which has by far the best price:performance ratio.

  7. Using ffmpeg just like that is… dumb. You need serious re-encoding performance AND you need to crank the settings as high as possible, so that the image quality is decent and the inference can make actual f*cking guesses instead of missing all the time because you're sending it a pixelated 320×240 mess. OR (hear me out): you could use a proper webcam or surveillance cam which pumps out the stream itself.

  8. The annoying thing with Google TensorFlow is how often stuff breaks with new releases. Over the last 20 years, I've lost count of how many times Google decided to break backward compatibility in their projects. Google's idea of open source kinda blows.

  9. My goal is to build an LLM setup one day, but I don't have a good enough graphics card to handle larger models.

    Can you give me an idea of what you would do for strictly an LLM setup, one that can handle a 70B model?

    Ideas for small but powerful builds, or something specific to that, would be cool. I was ready to buy a ZimaBlade, ZimaBoard, and the Coral thing, but I want the best without buying an A100.