Introducing SWE-grep and SWE-grep-mini: Cognition’s model family for fast agentic search at >2,800 TPS. Surface the right files to your coding agent 20x faster. Now rolling out gradually to Windsurf users via the Fast Context subagent – or try it in our new playground! Read more: https://lnkd.in/gxFEvJq5
Blog post and technical overview: https://cognition.ai/blog/swe-grep
More Relevant Posts
-
🚗 Can Vision-Language Models work well in live video detection? As part of the Car Wash Monitoring System I’m currently developing, I’ve been exploring how a Vision-Language Model (VLM) can work together with YOLO for live car detection. YOLO handles the basic object detection, while every few frames (around every 10) I send an image to the VLM to check things like: “What is the car size (big/small)?”, “Is the hood open?”, or “Are two or more doors open?” I used threading so that the VLM runs in the background without slowing down the live stream. The reason behind this experiment was to test whether the VLM can extract extra car features (like door or hood status) that I don’t currently have models trained for, and to see whether it can be a practical alternative to training new datasets. So far, the VLM gives flexible reasoning, but trained models still win on speed and stability. Both have their strengths depending on the task. Here’s a short demo of the system 👇 #ComputerVision #YOLO #VLM #DeepLearning #EdgeAI #ArtificialIntelligence #OpenCV #MachineLearning #Python #RealtimeAI #AIEngineering #Threading
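A minimal sketch of the pattern described above (assuming the ultralytics YOLO package; query_vlm is a placeholder for whichever VLM endpoint is used, not a real API):

```python
# Sketch: YOLO runs on every frame; a VLM is queried every ~10 frames in a
# background thread so the live stream never blocks on the slower model.
import threading
import cv2
from ultralytics import YOLO  # assumes the ultralytics package is installed

model = YOLO("yolov8n.pt")          # any trained detection weights
latest_vlm_answer = {"text": ""}    # shared state, updated by the worker thread

def query_vlm(frame, answer_slot):
    # Placeholder: encode the frame and send it to a VLM with questions like
    # "Is the hood open?" or "Are two or more doors open?".
    # answer_slot["text"] = <VLM response>
    pass

cap = cv2.VideoCapture(0)
frame_idx = 0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    detections = model(frame, verbose=False)[0]   # fast per-frame detection
    annotated = detections.plot()                 # draw YOLO boxes

    if frame_idx % 10 == 0:                       # every ~10 frames
        threading.Thread(target=query_vlm,
                         args=(frame.copy(), latest_vlm_answer),
                         daemon=True).start()

    cv2.putText(annotated, latest_vlm_answer["text"], (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
    cv2.imshow("car wash monitor", annotated)
    frame_idx += 1
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```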
-
Big update coming to PBRgen! We are working hard to make our AI-powered PBR material generator better in every way, and I’m glad to share what we’re building. First up, our new color generation method is showing great results. The materials now follow prompts much more closely and look better overall. The examples below are pure text-to-material: no guide maps, no input colors, no style images. (More on input colors later; they’ll be awesome!) We’re reworking a lot, from the backend to the UI, and making it easier and more fun for artists to use. We’re taking our time to make sure it feels solid and well-designed before we release. Can’t wait? Join our free beta here: https://lnkd.in/er2Uw4TA Thanks a lot for all your feedback and support! Cheers, Flip
-
What is the Self-Writing Internet? The Self-Writing Internet is a way to democratize access to building technology. It opens doors for people who don't have access to traditional coding tools, allowing anyone to rapidly prototype and deploy apps and websites simply by chatting with AI. The Caffeine platform, built by our code camp partner ICP, makes this process easy for every aspiring builder out there—even those with limited technical background. Our FPV video shows the final hackathon submission stage, where participants deployed their applications to the Internet Computer (ICP) network, enabling resilient, data-safe apps. The final project judging and panel talks are happening now at High Grounds Café! 🇵🇭 Check out this FPV drone tour of the event! Want the CaffeineAI Hackathon at a DEVCON chapter near you soon? Tell us your city in the comments! #SelfWritingInternet #CaffeineAI #InternetComputer #ICPHackathon
-
Good overview of the state of open models from Nathan Lambert. Building a dedicated agent with a narrow task doesn't require a big, expensive, and slow frontier model. The open models have standardized their APIs and are available on very fast providers like Cerebras, or inexpensive providers like Fireworks.ai. We've come a long way in a year, from a world where running Llama locally was the alternative to OpenAI and Anthropic frontier models.
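On the standardized-API point: most open-model providers expose OpenAI-compatible chat endpoints, so switching providers is usually just a base_url and model-name change. A minimal sketch (the endpoint URL and model id below are placeholders, not real provider values):

```python
# Sketch: open-weight models served behind OpenAI-compatible endpoints can be
# swapped by changing only base_url and the model name. Check your provider's
# docs for the actual URL and model id.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_PROVIDER_KEY",
)

resp = client.chat.completions.create(
    model="open-model-name",  # placeholder open-weight model id
    messages=[{"role": "user", "content": "Summarize this ticket in one line."}],
)
print(resp.choices[0].message.content)
```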
The PyTorch recording of my Open Models Recap talk is out. I think this is a great and very timely talk; I'm very happy with it and recommend you watch it more than I'd recommend my usual content. (Thanks again to the PyTorch team -- great event) https://lnkd.in/dbJzZY35
Recapping Open Models in 2025
-
Yes, even you can become technical. I spent years feeling behind in tech despite taking CS classes. It was easy to get overwhelmed just experimenting with the concepts and skills I'd learned, let alone building out full-stack projects on my own. But AI coding tools have changed everything. Now you can go from idea to deployed app in hours, not months. I just published a guide on vibe coding: when it works, when it doesn't, and the exact stack I use to ship projects quickly. Whether you're validating a startup idea or building for fun, there's never been a better time to start building. Read the full post here: https://lnkd.in/gaCS2zSK #VibeCoding #BuildInPublic #AI #NoCode #StartupTools
-
ByteDance just unveiled Depth Anything 3 (DA3), a powerful new model capable of predicting spatially consistent geometry from virtually any visual input, with or without known camera poses. Even more impressive: the entire system is built on a single plain transformer architecture and released under Apache 2.0.
🔹 Why DA3 Matters
▪️ Leverages a vanilla DINO encoder for depth estimation
▪️ Shows that a single depth-ray representation is sufficient
▪️ Delivers major improvements over DA2 in monocular depth
▪️ Surpasses VGGT on multi-view depth and camera pose estimation
▪️ Trained entirely on public academic datasets, ensuring openness and reproducibility
🔗 Resources
Discussion: https://lnkd.in/dMgakzWm
Paper: arxiv.org/pdf/2511.10647
Project: https://lnkd.in/dnByyn2z
Repo: https://lnkd.in/daCVz_4a
Demo: https://lnkd.in/dKUZiJtx
#DepthAnything3 #DA3 #ByteDanceAI #ComputerVision #3DReconstruction #DepthEstimation #MVD #AIResearch #Transformers #OpenSourceAI #MachineLearning #CVCommunity #VisionAI #DeepLearning Umar Iftikhar
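For anyone who wants to try a depth model quickly, a minimal sketch using the Hugging Face depth-estimation pipeline is below. The model id is a placeholder; substitute the hub id of whichever Depth Anything checkpoint (DA2/DA3) is published in a transformers-compatible format.

```python
# Sketch: monocular depth from a single image via the generic
# depth-estimation pipeline. The model id below is a placeholder.
from transformers import pipeline
from PIL import Image

depth = pipeline("depth-estimation", model="your-org/your-depth-checkpoint")
result = depth(Image.open("car.jpg"))
result["depth"].save("car_depth.png")   # per-pixel depth rendered as an image
```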
-
Thrilled to share a project I've been deeply focused on: implementing a live, real-time Visual SLAM system on a modern Android device. This was a significant porting and engineering challenge. The core of the work involved:
a) Porting a native C++/JNI codebase from an outdated Android 9/10 demo to run on Android 15, which involved rewriting the entire permission and scoped-storage handling from scratch.
b) Building the system without ARCore, relying directly on foundational libraries (OpenCV, Eigen, Boost) for the full, high-performance SLAM pipeline.
c) Writing a custom 2D path-tracking algorithm to replace the original demo.
d) Developing a novel low-light adaptation that smartly uses light sources as stable landmarks, improving tracking when it would normally fail (you can see the keypoints cluster on lights in the video).
The attached video shows the final result: a real-time ORB-SLAM3 implementation that accurately maps a 2D path (top-left) in a dark environment, successfully capturing turns and curves without lag. A huge thanks to our Prof. Sujay Kadam for his invaluable guidance and mentorship on this complex problem. Special thanks as well to my project partner, Lingampalli Venkata Subramanyam, for the collaboration and for helping me work through some of the core physics and motion concepts.
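The project itself is native C++, but the low-light idea can be sketched in a few lines of Python/OpenCV: restrict feature detection to bright regions so light sources become the stable landmarks. A rough illustration (thresholds are illustrative, not the project's actual values):

```python
# Sketch: bias ORB feature detection toward bright regions (lamps, LEDs,
# reflections) so they act as stable landmarks in dark scenes.
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=1000)

def detect_lowlight_features(frame_bgr, brightness_thresh=200):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Mask keeps only bright pixels, then grows a margin around each light.
    mask = cv2.inRange(gray, brightness_thresh, 255)
    mask = cv2.dilate(mask, np.ones((15, 15), np.uint8))
    keypoints, descriptors = orb.detectAndCompute(gray, mask)
    return keypoints, descriptors

# The keypoints/descriptors then feed the usual matching + pose pipeline.
```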
-
🤔 Have you heard about the UR Forum? Here are some of the things you can use it for:
→ Sharing tips
→ Asking questions
→ Troubleshooting ideas
→ Getting programming advice
→ Exploring available integrations
→ Learning from other people who have a robot
Multiply Labs used the forum to get to know their robot, and it played a big role in helping them automate the manufacturing of life-saving medicines. Read the case study about Multiply Labs here: http://urrobots.com/iXC
Multiply Labs, case study 🎥
-
In this studio, we are building and running agentic reinforcement learning (RL) environments using OpenEnv, an open-source framework by Meta’s PyTorch team. It provides a standard for interacting with agentic execution environments through simple Gymnasium-style APIs — step(), reset(), and state(). https://lnkd.in/dwUJDnmi
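In sketch form, the interaction loop looks like the snippet below. Class and field names are illustrative; the exact imports and environment classes come from the OpenEnv docs for the specific environment you run.

```python
# Sketch: a Gymnasium-style rollout loop against an agentic environment that
# exposes step(), reset(), and state(), as described in the post.
def rollout(env, policy, max_steps=32):
    obs = env.reset()                  # start a fresh episode
    for _ in range(max_steps):
        action = policy(obs)           # agent picks the next action
        obs = env.step(action)         # environment executes it
        if getattr(obs, "done", False):
            break
    return env.state()                 # final state, e.g. for reward/logging
```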
-
🚀 I just built a fun little project: a real-time hand gesture recognition system to control my PC's volume and screen brightness. Key features:
1) Single-hand detection: controls only volume.
2) Dual-hand detection: one hand controls volume while the other controls screen brightness.
3) Real-time tracking with consistent performance around 30 FPS.
4) Smooth and stable adjustments when hands are removed.*
Used OpenCV & MediaPipe for hand tracking, Pycaw & screen_brightness_control for volume and brightness adjustments, the time module for FPS calculation, and NumPy & math for the remaining calculations.
* I added logic to lock the desired volume and brightness levels, so that when I move my hands away, the brightness and volume stay stable.
Attached: a short video showing how this project works. #ComputerVision #ComputerVisionProjects
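A trimmed sketch of the single-hand volume path described above (Windows only, since Pycaw wraps the Windows audio API; distance-to-volume mapping values are illustrative, not necessarily those used in the project). The brightness path with screen_brightness_control works the same way.

```python
# Sketch: MediaPipe tracks the hand, the thumb-index pinch distance is mapped
# to a 0.0-1.0 volume scalar via pycaw.
import math
import cv2
import mediapipe as mp
import numpy as np
from ctypes import cast, POINTER
from comtypes import CLSCTX_ALL
from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume

speakers = AudioUtilities.GetSpeakers()
interface = speakers.Activate(IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
volume = cast(interface, POINTER(IAudioEndpointVolume))

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    h, w, _ = frame.shape
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        # Landmark 4 = thumb tip, landmark 8 = index fingertip.
        x1, y1 = lm[4].x * w, lm[4].y * h
        x2, y2 = lm[8].x * w, lm[8].y * h
        dist = math.hypot(x2 - x1, y2 - y1)
        # Map pinch distance (~20-200 px) to a 0.0-1.0 volume scalar.
        level = float(np.clip((dist - 20) / 180, 0.0, 1.0))
        volume.SetMasterVolumeLevelScalar(level, None)
    cv2.imshow("gesture control", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```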