Python Third-Party Libraries
Warning
This article was last updated on 2021-12-13; its contents may be out of date.
Commonly used Python third-party libraries.
Machine Learning
- scikit-learn: machine learning in Python.
🌟48.1k
- XGBoost: Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow.
🌟21.9k
- LightGBM: A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
🌟13.2k
- CatBoost: A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
🌟6.2k
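As a rough illustration of what these libraries are used for, here is a minimal sketch that trains a gradient-boosted tree classifier with scikit-learn. It assumes scikit-learn is installed; the dataset (iris), the split ratio, and the hyperparameters are arbitrary choices for the example, not recommendations. XGBoost, LightGBM and CatBoost offer scikit-learn-style estimators for the same model family, but their own APIs are not shown here.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Load a small built-in dataset (150 samples, 4 features, 3 classes).
X, y = load_iris(return_X_y=True)

# Hold out 25% of the data for evaluation; the seed is an arbitrary choice.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Gradient-boosted decision trees, as implemented by scikit-learn itself.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

The `fit` / `score` pattern shown here is the common estimator interface in scikit-learn, so swapping in a different classifier usually only changes the line that constructs `model`.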
Deep Learning
- TensorFlow: TensorFlow is an end-to-end open source platform for machine learning.
🌟161k
- PyTorch: Tensors and dynamic neural networks in Python with strong GPU acceleration.
🌟52.4k
- MXNet: Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more.
🌟19.8k
- PaddlePaddle: PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice.
🌟17.1k
- JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more.
🌟15.2k
Hyperparameter Optimization
- Optuna: A hyperparameter optimization framework.
🌟5,642
Automated Machine Learning
- auto-sklearn: Automated Machine Learning with scikit-learn.
🌟5,907
- PyCaret: An open-source, low-code machine learning library in Python.
🌟4,541
- Auto-PyTorch: Automatic architecture search and hyperparameter optimization for PyTorch.
🌟1,464
Time Series Analysis
- Prophet: Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
🌟13,809
- tsfresh: The package provides systematic time-series feature extraction by combining established algorithms from statistics, time-series analysis, signal processing, and nonlinear dynamics with a robust feature selection algorithm.
🌟6,074
- sktime: A unified framework for machine learning with time series.
🌟4,721
- Kats: Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.
🌟3,308
- Darts: A Python library for easy manipulation and forecasting of time series.
🌟3,208
- GluonTS: Probabilistic time series modeling in Python.
🌟2,351
- Merlion: A Machine Learning Framework for Time Series Intelligence.
🌟2,275
- tslearn: A machine learning toolkit dedicated to time-series data.
🌟1,904
- tsai: tsai is an open-source deep learning package built on top of PyTorch and fastai, focused on state-of-the-art techniques for time series tasks like classification, regression, forecasting, imputation…
🌟1,443
- Greykite: The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.
🌟1,358
- mcfly: A deep learning tool for time series classification.
🌟337
Computer Vision
- Face Recognition: The world’s simplest facial recognition API for Python and the command line.
🌟42.3k
Natural Language Processing
- Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.
🌟54.9k
- jieba: "Jieba" Chinese text segmentation: built to be the best Python Chinese word segmentation module.
🌟27.4k
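- HanLP: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and key-phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, and natural language processing.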
🌟24.5k
- spaCy: Industrial-strength Natural Language Processing (NLP) in Python.
🌟21.9k
- AllenNLP: An open-source NLP research library, built on PyTorch.
🌟10.7k
- NLTK: NLTK, the Natural Language Toolkit, is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing.
🌟10.3k
- TextBlob: Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
🌟8k
- fastNLP: fastNLP is a lightweight framework for natural language processing (NLP) that aims to make it quick to implement NLP tasks and build complex models.
🌟2.4k
- textacy: NLP, before and after spaCy.
🌟1.8k
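- xmnlp: Provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical lookup, and other features.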
🌟736
Scientific Computing
- NumPy: The fundamental package for scientific computing with Python.
🌟18.9k
- SciPy: SciPy is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.
🌟8.9k
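To make the division of labor between the two packages concrete, here is a small sketch, assuming both NumPy and SciPy are installed. NumPy supplies the array type and vectorized arithmetic, while SciPy builds numerical routines (here, integration and scalar optimization) on top of it. The functions being integrated and minimized are arbitrary examples.

```python
import numpy as np
from scipy import integrate, optimize

# NumPy: vectorized arithmetic on a whole array, with no explicit Python loop.
x = np.linspace(0.0, 1.0, 5)  # five evenly spaced points from 0 to 1
squares = x ** 2

# SciPy integration: numerically integrate t^2 over [0, 1] (exact value is 1/3).
area, abs_error = integrate.quad(lambda t: t ** 2, 0.0, 1.0)

# SciPy optimization: minimize (t - 2)^2, whose minimum lies at t = 2.
result = optimize.minimize_scalar(lambda t: (t - 2.0) ** 2)

print(squares)
print(area, result.x)
```

`integrate.quad` returns both the estimate and an error bound, which is why two values are unpacked.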
Data Processing
- pandas: Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.
🌟31.8k
- pdfminer.six: It is a tool for extracting information from PDF documents.
🌟3.2k
- openpyxl: openpyxl is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.
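The "labeled data structures" mentioned in the pandas description can be sketched as follows, assuming pandas is installed. The column names and values are made up purely for illustration.

```python
import pandas as pd

# A small labeled table; the columns "name", "group" and "value" are hypothetical.
df = pd.DataFrame(
    {
        "name": ["w", "x", "y", "z"],
        "group": ["a", "a", "b", "b"],
        "value": [10, 20, 30, 50],
    }
)

# Group rows by label and aggregate each group.
totals = df.groupby("group")["value"].sum()

# Boolean indexing selects the rows that satisfy a condition.
large = df[df["value"] > 15]["name"].tolist()

print(totals)
print(large)
```

Columns are addressed by name rather than by position, which is the main practical difference from working with a plain NumPy array.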
Visualization
- Matplotlib: Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
🌟14.6k
- seaborn: Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
🌟9k
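For a sense of what a basic static plot looks like in code, here is a minimal Matplotlib sketch, assuming Matplotlib is installed. It selects the non-interactive `Agg` backend and writes the PNG into an in-memory buffer so that it also runs on a machine without a display; in a notebook or desktop session you would more likely call `plt.show()` or save to a file instead. The plotted data are arbitrary.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt

# Arbitrary example data: the first few squares.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

# One figure with a single set of axes.
fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render the figure as PNG bytes into an in-memory buffer.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

seaborn builds on this same figure and axes model, so a seaborn plot can be customized afterwards through the Matplotlib objects it draws on.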