Python Third-Party Libraries

Warning
This article was last updated on 2021-12-13; its content may be out of date.

Commonly used Python third-party libraries.

  • scikit-learn: machine learning in Python. 🌟48.1k
  • XGBoost: Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow. 🌟21.9k
  • LightGBM: A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. 🌟13.2k
  • CatBoost: A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU. 🌟6.2k
  • TensorFlow: TensorFlow is an end-to-end open source platform for machine learning. 🌟161k
  • PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration. 🌟52.4k
  • MXNet: Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more. 🌟19.8k
  • PaddlePaddle: PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice. 🌟17.1k
  • JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more. 🌟15.2k
  • Optuna: A hyperparameter optimization framework. 🌟5,642
  • auto-sklearn: Automated Machine Learning with scikit-learn. 🌟5,907
  • PyCaret: An open-source, low-code machine learning library in Python. 🌟4,541
  • Auto-PyTorch: Automatic architecture search and hyperparameter optimization for PyTorch. 🌟1,464
  • Prophet: Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth. 🌟13,809
  • tsfresh: The package provides systematic time-series feature extraction by combining established algorithms from statistics, time-series analysis, signal processing, and nonlinear dynamics with a robust feature selection algorithm. 🌟6,074
  • sktime: A unified framework for machine learning with time series. 🌟4,721
  • Kats: A kit to analyze time series data: a lightweight, easy-to-use, generalizable, and extendable framework for time series analysis, from understanding key statistics and characteristics, to detecting change points and anomalies, to forecasting future trends. 🌟3,308
  • Darts: A Python library for easy manipulation and forecasting of time series. 🌟3,208
  • GluonTS: Probabilistic time series modeling in Python. 🌟2,351
  • Merlion: A Machine Learning Framework for Time Series Intelligence. 🌟2,275
  • tslearn: A machine learning toolkit dedicated to time-series data. 🌟1,904
  • tsai: An open-source deep learning package built on top of PyTorch and fastai, focused on state-of-the-art techniques for time series tasks like classification, regression, forecasting, imputation… 🌟1,443
  • Greykite: The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite. 🌟1,358
  • mcfly: A deep learning tool for time series classification. 🌟337
  • Face Recognition: The world’s simplest facial recognition api for Python and the command line. 🌟42.3k
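  • Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX. 🌟54.9k
  • jieba: "Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best Python Chinese word segmentation module. 🌟27.4k
  • HanLP: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing. 🌟24.5k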
  • spaCy: Industrial-strength Natural Language Processing (NLP) in Python. 🌟21.9k
  • AllenNLP: An open-source NLP research library, built on PyTorch. 🌟10.7k
  • NLTK: NLTK (the Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. 🌟10.3k
  • TextBlob: Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. 🌟8k
  • fastNLP: fastNLP is a lightweight framework for natural language processing (NLP) that aims to make it quick to implement NLP tasks and build complex models. 🌟2.4k
  • textacy: NLP, before and after spaCy. 🌟1.8k
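  • xmnlp: Provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical (character component) lookup, and other features. 🌟736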
  • NumPy: The fundamental package for scientific computing with Python. 🌟18.9k
  • SciPy: SciPy is an open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. 🌟8.9k
  • pandas: Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more. 🌟31.8k
  • pdfminer.six: It is a tool for extracting information from PDF documents. 🌟3.2k
  • openpyxl: openpyxl is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.
  • Matplotlib: Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. 🌟14.6k
  • seaborn: Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics. 🌟9k
  • Requests: A simple, yet elegant, HTTP library. 🌟46.5k
  • Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python. 🌟42.2k
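
Since every entry above is a third-party package rather than part of the standard library, it can be handy to check which ones are already present in the current environment. The following is a minimal sketch using only the standard library's importlib.metadata (Python 3.8+). The helper name installed_versions and the PACKAGES selection are illustrative choices, not something the list above prescribes, and the names are assumed to be the PyPI distribution names, which can differ from the import names (scikit-learn is imported as sklearn, for example).

```python
from importlib import metadata

# Illustrative selection from the list above. These are assumed to be the
# PyPI distribution names, which may differ from the import names.
PACKAGES = ["numpy", "pandas", "scikit-learn", "requests"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can usually be added with pip install followed by the distribution name.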