scikit-learn

Software Development

Open Source library for Machine Learning in Python.

About us

scikit-learn is an Open Source library for machine learning in Python.

Website
https://scikit-learn.org
Industry
Software Development
Company size
2-10 employees
Type
Nonprofit

Updates

  • 📍 The place to be next week - 2 days packed with data science in Paris! We’re excited to bring together the scikit-learn community along with leaders from both technology and business for the very first edition of Probability 1.0, a scikit-learn–centric event designed to share insights, best practices, and real-world impact. A huge thank you to the organiser :probabl. and to our partners Quansight, BNP Paribas, CHANEL, and Inria for making this premiere edition possible. 🎉 Looking forward to two days of learning, collaboration, and open-source excellence.

    View organization page for :probabl.

    11,890 followers

    Announcing Probability 1.0: A New Flagship Event for Open-Source Data Science and scikit-learn

    🎟️ We’re giving away 10 FREE tickets! Comment “probability 1.0” below to enter the draw.

    As part of AdoptAI, we are thrilled to share that :probabl. and scikit-learn are launching the very first edition of Probability 1.0, our annual event dedicated to the future of open-source machine learning featuring scikit-learn and its entire ecosystem. For its fir

    📅 November 25–26, 2025
    📍 Le Grand Palais, Paris, booth T2 Tech Demo Zone
    As part of Adopt AI: 25,000+ attendees, 500+ speakers, 10 stages, 250+ exhibitors
    https://lnkd.in/eDz5MzSE

    What to expect:
    🎤 Talks from leading contributors to scikit-learn and the broader ecosystem, see the schedule: https://lnkd.in/eDz5MzSE
    🤝 Partner sessions, collaborations, and high-value networking opportunities
    🟧 A dedicated meeting area with table tops for deeper exchanges
    🟦 A speaker zone

    Thanks to our partners for this event: Inria, CHANEL, BNP Paribas, Quansight

    #ProbabilityOne #DataScience #MachineLearning #OpenSource #scikitlearn #Probabl #OwnYourDataScience

  • scikit-learn reposted this

    View profile for Daniel Brown

    Data Analyst | Enrolled in Master’s in Data Science Program | Proficient in Python, SQL, and Excel | Finalist at MIT Hackathon | Proven Cost-Saving Strategist

    Machine Learning doesn’t have to be hard.

    ML Model of the Week: DBSCAN

    Unlike K-Means, which needs the number of clusters up front and favors compact, roughly spherical clusters, DBSCAN finds clusters of any shape by focusing on density. Points in sparse areas are marked as outliers.

    Applications include:
    - Geospatial clustering (traffic hotspots, disease mapping)
    - Anomaly detection (fraud, rare behaviors)
    - Recommendation systems (e.g., grouping movies into “cult classics,” “blockbusters,” or niche indie films, while marking truly unique ones as outliers)

    Visualization 1: DBSCAN detects clusters by density: dark points are cores, lighter points are borders, and gray points are noise. No predefined cluster shapes needed.

    Visualization 2: Density Map (KDE). Highlights areas of high point density that DBSCAN interprets as clusters, with low-density regions naturally marked as noise. It’s a clear view of the “landscape” DBSCAN sees.

    Visualization 3: k-Distance Plot. Shows the distance to each point’s 5th nearest neighbor, sorted in ascending order. The sharp increase (elbow) suggests a reasonable ε value, the radius DBSCAN uses to define dense neighborhoods.

    Takeaway: DBSCAN doesn’t just cluster: it identifies shape and density, making it well suited to real-world, messy, or non-linear data.

    #matplotlib #sklearn #ml #python #jupyterlab scikit-learn
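The ideas in this post (the k-distance curve, and the split into core, border, and noise points) can be sketched in a few lines. This is a minimal illustration, not the code behind the post's figures: the two-moons toy data, `eps=0.2`, and the choice of `k = 5` as `min_samples` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.neighbors import NearestNeighbors

# Toy data (an assumption for illustration): two interleaved half-moons,
# a non-convex shape that centroid-based methods handle poorly.
X, _ = make_moons(n_samples=400, noise=0.08, random_state=0)

# k-distance curve: distance from each point to its 5th nearest neighbor.
# Calling kneighbors() without arguments excludes the point itself.
k = 5
distances, _ = NearestNeighbors(n_neighbors=k).fit(X).kneighbors()
k_dist = np.sort(distances[:, -1])  # plot this and look for the elbow

# eps is hand-picked here; in practice, read it off the elbow of k_dist.
db = DBSCAN(eps=0.2, min_samples=k).fit(X)
labels = db.labels_  # -1 marks noise

core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True
noise_mask = labels == -1
border_mask = ~core_mask & ~noise_mask  # in a cluster, but not a core point

n_clusters = len(set(labels) - {-1})
print(f"clusters={n_clusters} core={core_mask.sum()} "
      f"border={border_mask.sum()} noise={noise_mask.sum()}")
```

Plotting `k_dist` with matplotlib gives the k-distance plot described above. Note that `min_samples` in scikit-learn counts the point itself, so it is not exactly the same quantity as the neighbor rank used for the curve.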

  • That’s BIG for Open Source AI and our scikit-learn project! 💥 Congrats to :probabl., the community and its incredible dynamism. This milestone shows how far open source machine learning has come, and how much stronger we are when we build together.

    View organization page for :probabl.

    🚀 We are proud to announce a €13M #seed round to raise the bar on enterprise AI adoption, starting with open source machine learning. As Inria’s spin-off and official operator of scikit-learn (2.5 billion downloads), we aim to turn artisanal data science into industrial-grade infrastructure: reliable, traceable, and sovereign. ✨🌟 This round was co-led by Serena and Capital Fund Management (CFM), with renewed support from Mozilla Ventures and #FrenchTechSouveraineté operated by Bpifrance under the #France2030 plan. Welcome aboard and thank you for aligning with our vision. 💪 And of course, here's a glimpse of the hearts and souls building core open source AI technology and value-adding enterprise software. 🙏👇

  • View organization page for scikit-learn

    122,102 followers

    🌱 When open source sparks entrepreneurship: scikit-learn as a launchpad As a community-driven open-source library, scikit-learn has become the foundation for a dynamic ecosystem of entrepreneurs and ventures. ✨ :probabl. created by scikit-learn core developers, helping organizations scale their data science, selling data science expertise while contributing back to open source through skrub, skore, scikit-learn and many more... 🔗 https://probabl.ai and very recently: ✨ Skfolio Labs — co-founded by Hugo Delatte and Daniel Farrell, contributing and supporting Skfolio, an open-source library built on top of scikit-learn, specializing in portfolio optimization and risk management. 🔗 https://skfoliolabs.com 📢 Read the official announcements: Hugo’s post 👉 https://lnkd.in/ePwyU75u Daniel’s post 👉 https://lnkd.in/eqXHjP5r We’re excited to see new projects emerging that extend scikit-learn into specialized domains, showing how open-source innovation can spark sustainable ventures and empower whole industries. #scikitlearn #OpenSource #MachineLearning #DataScience #QuantFinance

    View profile for Daniel Farrell

    Co-Founder at Skfolio Labs | Former QIS & Derivatives Trader

    I’m very excited to share that, together with Hugo Delatte, we have co-founded Skfolio Labs!   Skfolio Labs is the company behind skfolio, the open source portfolio optimization and risk management library. We provide enterprise support, bespoke extensions and implementation guidance for teams integrating skfolio or building with it.   We strongly believe that open source is a strategic advantage for users and developers. It removes vendor lock-in, promotes best practices and enables rapid iteration.   Our aim is to build a company around skfolio whilst staying true to the open source principles that have allowed the library to thrive since its release in 2024. It’s been exciting to see skfolio already being adopted across a wide variety of applications and I’m looking forward to helping more teams put it to work in production.   Over the years I’ve been fortunate to work with some exceptional people and to help build trading desks at some of the most ambitious firms in our space. I’m incredibly grateful for the lessons, support and guidance I’ve received along the way. As long as I can remember all the smart advice I've been given(!), I’m confident that Hugo and I can build something to be proud of.   Very excited to reconnect with my network and explore new opportunities together!

  • A bunch of scikit-learn core contributors will attend or speak at PyData Paris 2025 on Tuesday and Wednesday next week. There is still time to grab a seat to attend the main conference talks and maybe even present a lightning talk ;) Ticketing, practical information, and the schedule are at: https://lnkd.in/epawxTkH We will also participate in the sprint day on Thursday the 2nd, but unfortunately that day has limited seats and there is already a growing dedicated waitlist. Looking forward to meeting you there, exchanging with experts from all fields of data science, learning more about your use cases, and collecting feedback for the future development of our library and related projects!

  • scikit-learn was honored to be selected to participate in Cohort 2 of the GitHub Secure Open Source Fund (OSF) Training Program. It was an intense 3-week training program, with over 90 open source maintainers taking part and numerous workshops delivered by experts at the GitHub Security Lab. ⭐️ For many of these workshops, the learning materials are publicly available, and they are shared in the article here. 🙏 Thank you to Gregg Cochran, Kevin Crosby, all the instructors at GitHub Security Lab, and the funders for this pivotal opportunity and support of open source projects. #opensource #security #machinelearning https://lnkd.in/emA_yx6a

  • scikit-learn reposted this

    View profile for Virgil Chan

    Forward Deployed Engineer - Pre-Sales at Union.ai

    The temperature scaling PR scikit-learn/scikit-learn/31068 that Christian Lorentzen and I worked on has finally been merged into scikit-learn’s main branch. Probability calibration with temperature scaling will be available in scikit-learn 1.8. I’d like to thank Olivier Grisel, Christian Lorentzen, and Omar Salman for guiding me through the entire process. For every minute I spent working on the PR, they spent hours reviewing it to ensure algorithmic efficiency, numerical stability, correct unit test implementation, and rigorous documentation. I'd also like to thank David Holzmüller for his hard work in convincing the scikit-learn core maintainers that temperature scaling is a valuable addition to the library --- the merge wouldn’t have been possible without him --- and Adrin J. for his confident vote of support for me to open the PR. https://lnkd.in/gmv2K_gc
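For readers unfamiliar with the technique, temperature scaling divides a classifier's logits by a single scalar T, fitted on held-out data by minimizing the negative log-likelihood. The sketch below is a from-scratch NumPy/SciPy illustration of that idea, not the code merged in the PR; the helper names `mean_nll` and `fit_temperature` and the synthetic data are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax, softmax


def mean_nll(logits, y, temperature=1.0):
    """Mean negative log-likelihood of labels y under softmax(logits / T)."""
    log_probs = log_softmax(logits / temperature, axis=1)
    return -log_probs[np.arange(len(y)), y].mean()


def fit_temperature(logits, y):
    """Find the single scalar T > 0 that minimizes the NLL on held-out data."""
    # Optimize over log(T) so that T stays positive.
    result = minimize_scalar(
        lambda log_t: mean_nll(logits, y, np.exp(log_t)),
        bounds=(-5.0, 5.0),
        method="bounded",
    )
    return float(np.exp(result.x))


# Synthetic check: labels are drawn from softmax(z), but the "model" reports
# logits 3 * z, i.e. it is overconfident by a factor of 3.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 3))
p = softmax(z, axis=1)
y = (p.cumsum(axis=1) > rng.random((n, 1))).argmax(axis=1)
logits = 3.0 * z

T = fit_temperature(logits, y)
calibrated = softmax(logits / T, axis=1)  # calibrated class probabilities
print(f"T={T:.2f}")
```

Because dividing all logits by the same T never changes the argmax, this kind of calibration adjusts confidence without changing the predicted class.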

  • scikit-learn reposted this

    View organization page for :probabl.

    🔥 Many of you asked for training after we launched the official scikit-learn certifications, so at Probabl, we went further and for FREE. We’re proud to introduce Skolar, a new platform for hands-on, structured and certified training in open-source data science.

    📚 All content is created by scikit-learn core contributors, the very people who build and maintain the tools and documentation you use every day. Skolar builds on things we’ve learned from the scikit-learn MOOC with Inria and from the certifications, and pushes them further. We designed it for people who want practical skills, real expertise, and the ability to go beyond the basics.

    🧠 What’s inside Skolar:
    ✔️ Hands-on exercises and notebooks running directly in the browser
    ✔️ Quizzes to test your knowledge
    ✔️ Video lessons and expert-led tutorials
    ✔️ A growing library of structured, real-world training materials

    Start today with our free and open-source course: Scikit-learn Associate Practitioner Online Course
    Create your account now: 👉 skolar.probabl.ai
    See the main features: 👉 https://lnkd.in/euaPaVbe
    Learn more about this project: 👉 https://lnkd.in/epc2TWb4

    #datascience #machinelearning #opensource #scikitlearn #education #careerdevelopment #AI

  • 🚀 scikit-learn 1.7 is out 🚀 A big shoutout to the community of contributors who continue to push open-source machine learning forward ❤️

    ✨ Key Highlights:
    ▶️ Improved HTML representation of estimators
    ▶️ Custom validation set for histogram-based Gradient Boosting estimators
    ▶️ Plotting ROC curves from cross-validation results
    ▶️ Updated Array API support
    ▶️ Improved API consistency of Multi-layer Perceptron
    ▶️ Migration toward sparse arrays
    🔗 Check the full release highlights: https://lnkd.in/eZx9DJx7

    Discover scikit-learn 1.7 and its:
    🟢 9 new features
    🔵 2 efficiency improvements & 4 enhancements
    🟡 6 API changes
    🔴 7 fixes
    👥 146 contributors (thank you all!)
    📖 More details in the changelog: https://lnkd.in/eySABsYH

    You can upgrade with pip as usual: pip install -U scikit-learn
    Using conda-forge builds: conda install -c conda-forge scikit-learn

    #scikitlearn #MachineLearning #opensource #DataScience #Python #ML

  • scikit-learn reposted this

    View profile for Thomas J. Fan

    Member of Technical Staff at Modal

    It’s hard to believe that it’s been over six years since I joined the scikit-learn team as a maintainer. As of today, I have 1,374 commits and reviewed 3,179 pull requests. Behind these numbers, I am grateful for all the thoughtful discussions I have had with the community to push scikit-learn forward. Now that I look over my commits, I want to highlight some feature areas I am particularly fond of:

    1. Everything Trees 🌲🌲🌲
       - Native categorical support in Histogram-based Gradient Boosting Trees
       - Native missing value support in Random Forest & Trees
       - Cost complexity pruning in Trees
    2. DataFrame interoperability 🖼️
       - Pandas and Polars DataFrame output with the set_output API
       - get_feature_names_out: Mapping input feature names to output feature names
    3. Preprocessing 🕰️
       - TargetEncoder: Use the target to encode categorical data
       - Group infrequent categories in OrdinalEncoder and OneHotEncoder
       - KNN-based missing value imputation
    4. Visualizations 📊
       - HTML Representation to visualize estimators in Jupyter notebooks
       - Plotting API for evaluating or inspecting estimators
    5. Experimental GPU support 🏎️
       - Integrate Array API to run natively with PyTorch or CuPy arrays on a GPU

    I hope you found some of these features useful or discovered some of them here 😁. https://lnkd.in/ee_t5WRw
