Entries tagged [apachemxnet]
Apache MXNet (incubating) 1.5.0 release is now available
Today the Apache MXNet community is pleased to announce the 1.5.0 release of Apache MXNet deep learning framework. We would like to thank the Apache MXNet community for all their valuable contributions towards the MXNet 1.5.0 release.
With this release, we bring the following new features to our users. For a comprehensive list of major features and bug fixes, please check out the full Apache MXNet 1.5.0 release notes.
Automatic Mixed Precision (experimental)
Training Deep Learning networks is a very computationally intensive task. Novel model architectures tend to have increasing numbers of layers and parameters, which slow down training. Fortunately, software optimizations and new generations of training hardware make it a feasible task.
However, most of the hardware and software optimization opportunities exist in exploiting lower precision (e.g. FP16) to, for example, utilize Tensor Cores available on new Volta and Turing GPUs. While training in FP16 showed great success in image classification tasks, other more complicated neural networks typically stayed in FP32 due to difficulties in applying the FP16 training guidelines.That is where AMP (Automatic Mixed Precision) comes into play. It automatically applies the guidelines of FP16 training, using FP16 precision where it provides the most benefit, while conservatively keeping in full FP32 precision operations unsafe to do in FP16. To learn more about AMP, check out this tutorial .
MKL-DNN: Reduced precision inference and RNN API support
Two advanced features, fused computation and reduced-precision kernels, are introduced by MKL-DNN in the recent version. These features can significantly speed up the inference performance on CPU for a broad range of deep learning topologies. MXNet MKL-DNN backend provides optimized implementations for various operators covering a broad range of applications including image classification, object detection, and natural language processing. Refer to the MKL-DNN operator documentation for more information.
MXNet now supports Dynamic Shape in both imperative and symbolic mode. MXNet used to require that operators statically infer the output shapes from the input shapes. However, there exist some operators that the output depends on input data or input arguments. Now MXNet supports operators with dynamic shape such as contrib.while_loop, contrib.cond, and mxnet.ndarray.contrib.boolean_mask
Large Tensor Support
Before MXNet 1.5.0, MXNet supports maximal tensor size of around 4 billon (2^32). This is due to uint32_t being used as the default data type for tensor size, as well as variable indexing. Now you can enable large tensor support by changing the following build flag to 1: USE_INT64_TENSOR_SIZE = 1.(Note this is set to 0 by default) This enabled large scale training for example large graph network training using Deep Graph Library.
MXNet has added support for CUDA 10, CUDA 10.1, cudnn7.5, NCCL 2.4.2, and numpy 1.16.0.
These updates are available through PyPI packages and build from source, refer to installation guid for more details.
Gluon Fit API(experimental)
Training a model in Gluon requires users to write the training loop. This is useful because of its imperative nature, however repeating the same code across multiple models can become tedious and repetitive with boilerplate code. We have introduced an Estimator and Fit API to help facilitate training loop.
Additional feature improvements and bug fixes
In addition to the new features, we have the following improvement on existing features, performance, documentation, and bug fixes:
Improved experience with front-end APIs: Python, Java, Scala, C++, Clojure, Julia, Perl, and R
Several important performance improvements on both CPU and GPU
Fixed 200+ bugs to improve usability
Added 10+ examples/tutorials and 50+ documentation updates
100+ updates on build, test, and continuous integration system to improve MXNet contributor experience
Getting Started with MXNet:
To get started with MXNet, visit the install page. To learn more about MXNet Gluon interface and deep learning, you can follow our comprehensive book: Dive into Deep Learning. It covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. You can check out other related MXNet tutorials, MXNet blog posts, MXNet Twitter and MXNet YouTube channel.
Have fun with MXNet 1.5.0!
We would like to thank everyone from the Apache MXNet community for their contributions to the 1.5.0 release.
Announcing Apache MXNet (incubating) 1.3.0 Release
Today the Apache MXNet community is pleased to announce the 1.3 release of the Apache MXNet deep learning framework. We would like to thank the Apache MXNet community for all their valuable contributions towards the MXNet 1.3 release.
With this release, MXNet has Gluon package enhancements, ONNX export, experimental Clojure bindings, TensorRT integration, and many more features, enhancements and usability improvements! In this blog post, we briefly summarize some of the high-level features and improvements. For a comprehensive list of major features and bug fixes, read the Apache MXNet 1.3.0 release notes.
Gluon package enhancements
Gluon RNN layers are now hybridizable: With this feature, Gluon RNN layers such as
gluon.rnn.GRU can be converted to
HybridBlocks. Now, many dynamic networks that are based on Gluon RNN layers can be completely hybridized, exported and used in the inference APIs in other language bindings such as C/C++, Scala, R, etc.
Support for sparse tensor: Gluon
HybridBlocks now support hybridization with sparse operators. To enable sparse gradients in
gluon.nn.Embedding, simply set
gluon.contrib.nn.SparseEmbedding provides an example of leveraging sparse parameters to reduce communication cost and memory consumption for multi-GPU training with large embeddings.
Support for Synchronized Cross-GPU Batch Norm: Gluon now supports Synchronized Batch Normalization, available as gluon.contrib.nn.SyncBatchNorm. This enables stable training on large-scale networks with high memory consumption such as FCN for image segmentation.
Updated Gluon model zoo: Gluon Vision Model Zoo now provides MobileNetV2 pre-trained models. Updated existing pre-trained models to provide state-of-the-art performance on all ResNet v1, ResNet v2, and vgg16, vgg19, vgg16_bn, vgg19_bn models.
Introducing new Clojure bindings with MXNet
MXNet now has experimental support for the Clojure programming language. The MXNet Clojure package brings state-of-the-art deep learning to the Clojure community. It enables Clojure developers to code and to execute tensor computation on multiple CPUs or GPUs. It also enables users to write seamless tensor/matrix computations with multiple GPUs in Clojure. Now users can construct and customize state-of-art deep learning models in Clojure, and apply them to tasks such as image classification and data science challenges. To start using Clojure package in MXNet, check out the Clojure tutorials and Clojure API documentation.
Introducing control flow operators
This is the first step towards optimizing dynamic neural networks with variable computation graphs. This release adds symbolic and imperative control flow operators such as
cond. To learn more about how to use these operators, check out the Control Flow Operators tutorial.
TensorRT runtime integration: TensorRT provides significant acceleration of model inference on NVIDIA GPUs compared to running the full graph in MXNet using unfused GPU operators. In addition to faster fp32 inference, TensorRT optimizes fp16 inference and is capable of int8 inference (provided the quantization steps are performed). Besides increasing throughput, TensorRT significantly reduces inference latency, especially for small batches. With 1.3 release, MXNet introduces the runtime integration of TensorRT (experimental), in order to accelerate inference. Follow the MXNet-TensorRT article on the MXNet developer wiki to learn more about how to use this feature.
MKL-DNN enhancements: MKL-DNN is an open source library from Intel that contains a set of CPU-optimized deep learning operators. In the previous release, MXNet introduced integration with MKL-DNN to accelerate training and inference execution on CPU. With 1.3 release, we have increased support for these activation functions:
ONNX export support
Export MXNet models to ONNX format: MXNet 1.2 provided users a way to import ONNX models into MXNet for inference. More details are available in this ONNX blog post. With the latest 1.3 release, users can now export MXNet models into ONNX format and import those models into other deep learning frameworks for inference! Check out the MXNet to ONNX exporter tutorial to learn more about how to use the mxnet.contrib.onnx API.
Other experimental features
Apart from what we have covered above, MXNet now has support for:
A new memory pool type for GPU memory which is more suitable for all the workloads with dynamic-shape inputs and outputs. Set an environment variable as
MXNET_GPU_MEM_POOL_TYPE=Roundto enable this feature. Topology-aware Allreduce approach for single-machine GPU training. Train up to 6.6x and 5.9x faster on AlexNet and VGG compared to MXNet 1.2. Activate this feature using the “control the data communication” environmental variables.
Improved Scala APIs that focus on providing type safety and a better user experience.
NDArray.apibring a new set of functions that have a complete signature. The documentation for all of the arguments also integrates directly with IntelliJ IDEA. The new and improved Scala examples demonstrate usage of these new APIs.
Check out further details on these features in full release notes.
In addition to adding and extending new functionalities, the release also focusses on stability and refinements.
The community fixed 130 unstable tests improving MXNet’s stability and reliability. The MXNet Model Backwards Compatibility Checker was introduced. This is an automated test on MXNet’s continuous integration platform that verifies saved models’ backward compatibility. This helps ensure that models created with older versions of MXNet can be loaded and used with the newer versions.
Getting started with MXNet
To learn more about MXNet Gluon package and deep learning, you can follow our 60-minute crash course, and then later complete this comprehensive set of tutorials, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. You can also check out lots of material on MXNet tutorials, MXNet blog posts (中文), MXNet YouTube channel (中文). Have fun with MXNet 1.3.0!
We would like to thank everyone who contributed to the 1.3.0 release:
Aaron Markham, Abhinav Sharma, access2rohit, Alex Li, Alexander Alexandrov, Alexander Zai, Amol Lele, Andrew Ayres, Anirudh Acharya, Anirudh Subramanian, Ankit Khedia, Anton Chernov, aplikaplik, Arunkumar V Ramanan, Asmus Hetzel, Aston Zhang, bl0, Ben Kamphaus, brli, Burin Choomnuan, Burness Duan, Caenorst, Cliff Woolley, Carin Meier, cclauss, Carl Tsai, Chance Bair, chinakook, Chudong Tian, ciyong, ctcyang, Da Zheng, Dang Trung Kien, Deokjae Lee, Dick Carter, Didier A., Eric Junyuan Xie, Faldict, Felix Hieber, Francisco Facioni, Frank Liu, Gnanesh, Hagay Lupesko, Haibin Lin, Hang Zhang, Hao Jin, Hao Li, Haozhi Qi, hasanmua, Hu Shiwen, Huilin Qu, Indhu Bharathi, Istvan Fehervari, JackieWu, Jake Lee, James MacGlashan, jeremiedb, Jerry Zhang, Jian Guo, Jin Huang, jimdunn, Jingbei Li, Jun Wu, Kalyanee Chendke, Kellen Sunderland, Kovas Boguta, kpmurali, Kurman Karabukaev, Lai Wei, Leonard Lausen, luobao-intel, Junru Shao, Lianmin Zheng, Lin Yuan, lufenamazon, Marco de Abreu, Marek Kolodziej, Manu Seth, Matthew Brookhart, Milan Desai, Mingkun Huang, miteshyh, Mu Li, Nan Zhu, Naveen Swamy, Nehal J Wani, PatricZhao, Paul Stadig, Pedro Larroy, perdasilva, Philip Hyunsu Cho, Pishen Tsai, Piyush Ghai, Pracheer Gupta, Przemyslaw Tredak, Qiang Kou, Qing Lan, qiuhan, Rahul Huilgol, Rakesh Vasudevan, Ray Zhang, Robert Stone, Roshani Nagmote, Sam Skalicky, Sandeep Krishnamurthy, Sebastian Bodenstein, Sergey Kolychev, Sergey Sokolov, Sheng Zha, Shen Zhu, Sheng-Ying, Shuai Zheng, slitsey, Simon, Sina Afrooze, Soji Adeshina, solin319, Soonhwan-Kwon, starimpact, Steffen Rochel, Taliesin Beynon, Tao Lv, Thom Lane, Thomas Delteil, Tianqi Chen, Todd Sundsted, Tong He, Vandana Kannan, vdantu, Vishaal Kapoor, wangzhe, xcgoner, Wei Wu, Wen-Yang Chu, Xingjian Shi, Xinyu Chen, yifeim, Yizhi Liu, YouRancestor, Yuelin Zhang, Yu-Xiang Wang, Yuan Tang, Yuntao Chen, Zach Kimberg, Zhennan Qin, Zhi Zhang, zhiyuan-huang, Ziyue Huang, Ziyi Mu, Zhuo Zhang.
… and thanks to all of the Apache MXNet community supporters, spreading knowledge and helping to grow the community!
Apache MXNet 1.2.0 Release is out!
Today Apache MXNet community announced the 1.2 release of the Apache MXNet deep learning framework. The new capabilities in MXNet provide the following benefits to users:
- MXNet is easier to use
- New scala inference APIs: This release includes new Scala inference APIs which offer an easy-to-use, Scala idiomatic and thread-safe high level APIs for performing predictions with deep learning models trained with MXNet.
- Exception Handling Support for Operators: MXNet now transports backend C++ exceptions to the different language front-ends and prevents crashes when exceptions are thrown during operator execution
- MKL-DNN integration: MXNet now integrates with Intel MKL-DNN to accelerate neural network operators: Convolution, Deconvolution, FullyConnected, Pooling, Batch Normalization, Activation, LRN, Softmax, as well as some common operators: sum and concat. This integration allows NDArray to contain data with MKL-DNN layouts and reduces data layout conversion to get the maximal performance from MKL-DNN. Currently, the MKL-DNN integration is still experimental.
- Enhanced FP16 support: MXNet now adds support for distributed mixed precision training with FP16. It supports storing of master copy of weights in float32 with the multi_precision mode of optimizers. Improved speed of float16 operations on x86 CPU by 8 times through F16C instruction set.
- Import ONNX models into MXNet: Implemented a new ONNX module in MXNet which offers an easy to use API to import ONNX models into MXNet's symbolic interface. Checkout the example on how you could use this API to import ONNX models and perform inference on MXNet. Currently, the ONNX-MXNet Import module is still experimental.
Getting started with MXNet
Getting started with MXNet is simple. To learn more about the Gluon interface and deep learning, you can reference this comprehensive set of tutorials, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. If you’re a contributor to a machine learning framework, check out the interface specs on GitHub.
Apache MXNet 0.12 Release Adds Support for New NVIDIA Volta GPUs and Sparse Tensor
We are excited about the availability of Apache MXNet version 0.12. With this release, MXNet adds two new important features: support for NVIDIA Volta GPUs and support for Sparse Tensors
The MXNet v0.12 release adds support for NVIDIA Volta V100 GPUs, enabling users to train convolutional neural networks up to 3.5 times faster than on the Pascal GPUs. Trillions of floating-point (FP) multiplications and additions for training a neural network have typically been done using single precision (FP32) to achieve high accuracy. However, recent research has shown that the same accuracy can be achieved using half-precision (FP16) data types.
The Volta GPU architecture introduces Tensor Cores. Each Tensor Core can execute 64 fuse-multiply-add ops per clock, which roughly quadruples the CUDA core FLOPS per clock per core. Each Tensor Core performs D = A x B + C, where A and B are half-precision matrices, while C and D can be either half or single-precision matrices, thereby performing mixed precision training. The new mixed-precision training allows users to achieve optimal training performance without sacrificing accuracy by using FP16 for most of the layers of a network, and higher precision data types only when necessary.
You can take advantage of Volta Tensor Cores to enable FP16 training in MXNet by passing a simple command, "--dtype float16" to the MXNet training script. For example, you can invoke imagenet training script with command:
train_imagenet.py --dtype float16
MXNet v0.12 adds support for sparse tensors to efficiently store and compute tensors allowing developers to perform sparse matrix operations in a storage and compute-efficient manner and train deep learning models faster. MXNet v0.12 supports two major sparse data formats: Compressed Sparse Row (CSR) and Row Sparse (RSP). The CSR format is optimized to represent matrices with a large number of columns where each row has only a few non-zero elements. The RSP format is optimized to represent matrices with a huge number of rows where most of the row slices are complete zeros. For example, the CSR format can be used to encode the feature vectors of input data for a recommendation engine, whereas the RSP format can be used to perform the sparse gradient updates during training. This release enables sparse support on CPU for most commonly used operators such as matrix dot product and element-wise operators. Sparse support for more operators will be added in future releases.
Follow these tutorials to learn how to use the new sparse operators in MXNet.
Or, You can download and play with MXNet easily using one of the options below:
- The Pip package can be found here: https://pypi.python.org/pypi/mxnet
- The Docker Images can be found here: https://hub.docker.com/u/mxnet/
If you want to learn more about MXNet visit https://mxnet.incubator.apache.org/. Finally, you are welcome to join and also invite your friends to the dynamic and growing MXNet community by subscribing to firstname.lastname@example.org