Entries tagged [mxnet]

Tuesday July 23, 2019

Apache MXNet (incubating) 1.5.0 release is now available

Today the Apache MXNet community is pleased to announce the 1.5.0 release of Apache MXNet deep learning framework. We would like to thank the Apache MXNet community for all their valuable contributions towards the MXNet 1.5.0 release.

With this release, we bring the following new features to our users. For a comprehensive list of major features and bug fixes, please check out the full Apache MXNet 1.5.0 release notes.

Automatic Mixed Precision (experimental)

Training deep learning networks is a very computationally intensive task. Novel model architectures tend to have increasing numbers of layers and parameters, which slow down training. Fortunately, software optimizations and new generations of training hardware keep the task feasible.

However, most of the hardware and software optimization opportunities exist in exploiting lower precision (e.g. FP16) to, for example, utilize Tensor Cores available on new Volta and Turing GPUs. While training in FP16 showed great success in image classification tasks, other more complicated neural networks typically stayed in FP32 due to difficulties in applying the FP16 training guidelines.

That is where AMP (Automatic Mixed Precision) comes into play. It automatically applies the guidelines of FP16 training, using FP16 precision where it provides the most benefit while conservatively keeping operations that are unsafe in FP16 in full FP32 precision. To learn more about AMP, check out this tutorial.
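Below is a minimal sketch of how AMP can be enabled in Gluon training code, assuming a GPU context; the network, loss, and hyperparameters are placeholders rather than a recommended setup.

```python
import mxnet as mx
from mxnet import gluon
from mxnet.contrib import amp

# patch FP16-safe operators before the network is constructed
amp.init()

net = gluon.nn.Dense(10)
net.initialize(ctx=mx.gpu())
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})

# let AMP manage dynamic loss scaling for this trainer
amp.init_trainer(trainer)

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
data = mx.nd.random.uniform(shape=(32, 64), ctx=mx.gpu())
label = mx.nd.zeros((32,), ctx=mx.gpu())

with mx.autograd.record():
    loss = loss_fn(net(data), label)
    # scale the loss so small FP16 gradients do not underflow
    with amp.scale_loss(loss, trainer) as scaled_loss:
        mx.autograd.backward(scaled_loss)
trainer.step(data.shape[0])
```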

MKL-DNN: Reduced precision inference and RNN API support

The recent version of MKL-DNN introduces two advanced features: fused computation and reduced-precision kernels. These features can significantly speed up inference performance on CPU for a broad range of deep learning topologies. The MXNet MKL-DNN backend provides optimized implementations for various operators covering a broad range of applications, including image classification, object detection, and natural language processing. Refer to the MKL-DNN operator documentation for more information.
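As a hedged sketch, the reduced-precision path can be exercised through the quantization API on CPU; the checkpoint name below is a placeholder, and a real deployment would normally supply calibration data for better accuracy.

```python
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# load a trained FP32 symbolic model; 'resnet50' / epoch 0 are placeholder names
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet50', 0)

# convert to a quantized graph for reduced-precision inference on CPU;
# calib_mode='none' skips calibration, 'entropy' with calib_data is usually more accurate
qsym, qarg_params, aux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=mx.cpu(), calib_mode='none', quantized_dtype='uint8')
```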

Dynamic Shape (experimental)

MXNet now supports dynamic shapes in both imperative and symbolic mode. MXNet used to require that operators statically infer their output shapes from the input shapes. However, for some operators the output shape depends on the input data or input arguments. MXNet now supports such dynamic-shape operators, including contrib.while_loop, contrib.cond, and mxnet.ndarray.contrib.boolean_mask.
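For instance, mxnet.ndarray.contrib.boolean_mask produces an output whose shape depends on the values in the mask, not just the input shapes; a small illustrative example:

```python
import mxnet as mx

data = mx.nd.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])
mask = mx.nd.array([0, 1, 0])

# keeps only the rows where the mask is non-zero; the output shape (1, 3)
# cannot be inferred from the input shapes alone
out = mx.nd.contrib.boolean_mask(data, mask)
print(out.asnumpy())  # [[4. 5. 6.]]
```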

Large Tensor Support

Before MXNet 1.5.0, MXNet supported a maximal tensor size of around 4 billion elements (2^32). This was due to uint32_t being used as the default data type for tensor size as well as for variable indexing. You can now enable large tensor support by setting the build flag USE_INT64_TENSOR_SIZE = 1 (note that it is set to 0 by default). This enables large-scale training, for example large graph network training using the Deep Graph Library.

Dependency Update

MXNet has added support for CUDA 10, CUDA 10.1, cuDNN 7.5, NCCL 2.4.2, and NumPy 1.16.0.

These updates are available through PyPI packages and when building from source; refer to the installation guide for more details.

Gluon Fit API (experimental)

Training a model in Gluon requires users to write the training loop. This is useful because of its imperative nature; however, repeating the same boilerplate code across multiple models can become tedious. We have introduced an Estimator and Fit API to help simplify the training loop.
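A minimal sketch of the experimental Fit API is shown below, assuming a Gluon DataLoader named train_loader is already defined; the network, optimizer, and epoch count are placeholders.

```python
from mxnet import gluon, init
from mxnet.gluon.contrib.estimator import estimator

net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(64, activation='relu'),
        gluon.nn.Dense(10))
net.initialize(init.Xavier())

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'adam')

# the Estimator wraps the usual imperative training loop
est = estimator.Estimator(net=net, loss=loss_fn, trainer=trainer)
est.fit(train_data=train_loader, epochs=2)  # train_loader: a gluon.data.DataLoader
```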

Additional feature improvements and bug fixes

In addition to the new features, this release includes the following improvements to existing features, performance, documentation, and bug fixes:

  1. Improved experience with front-end APIs: Python, Java, Scala, C++, Clojure, Julia, Perl, and R

  2. Several important performance improvements on both CPU and GPU

  3. Fixed 200+ bugs to improve usability

  4. Added 10+ examples/tutorials and 50+ documentation updates

  5. 100+ updates on build, test, and continuous integration system to improve MXNet contributor experience

Getting Started with MXNet:

To get started with MXNet, visit the install page. To learn more about the MXNet Gluon interface and deep learning, you can follow our comprehensive book, Dive into Deep Learning, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. You can also check out other related MXNet tutorials, MXNet blog posts, the MXNet Twitter account, and the MXNet YouTube channel.

Have fun with MXNet 1.5.0!

Acknowledgements:

We would like to thank everyone from the Apache MXNet community for their contributions to the 1.5.0 release.

Monday September 17, 2018

Announcing Apache MXNet (incubating) 1.3.0 Release

Today the Apache MXNet community is pleased to announce the 1.3 release of the Apache MXNet deep learning framework. We would like to thank the Apache MXNet community for all their valuable contributions towards the MXNet 1.3 release.

With this release, MXNet has Gluon package enhancements, ONNX export, experimental Clojure bindings, TensorRT integration, and many more features, enhancements and usability improvements! In this blog post, we briefly summarize some of the high-level features and improvements. For a comprehensive list of major features and bug fixes, read the Apache MXNet 1.3.0 release notes.


Gluon package enhancements

Gluon RNN layers are now hybridizable: With this feature, Gluon RNN layers such as gluon.rnn.RNN, gluon.rnn.LSTM and gluon.rnn.GRU can be converted to HybridBlocks. Now, many dynamic networks that are based on Gluon RNN layers can be completely hybridized, exported and used in the inference APIs in other language bindings such as C/C++, Scala, R, etc.
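A brief sketch of hybridizing and exporting a Gluon LSTM layer; the shapes and the export prefix are placeholders.

```python
import mxnet as mx
from mxnet import gluon

lstm = gluon.rnn.LSTM(hidden_size=128, num_layers=2)
lstm.initialize()
lstm.hybridize()  # compile into a static graph on the first forward pass

# input layout is (sequence length, batch size, feature size)
out = lstm(mx.nd.random.uniform(shape=(10, 32, 64)))

# writes lstm-symbol.json and lstm-0000.params for use from other language bindings
lstm.export("lstm")
```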

Support for sparse tensor: Gluon HybridBlocks now support hybridization with sparse operators. To enable sparse gradients in gluon.nn.Embedding, simply set sparse_grad=True. Furthermore, gluon.contrib.nn.SparseEmbedding provides an example of leveraging sparse parameters to reduce communication cost and memory consumption for multi-GPU training with large embeddings.
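For example, a sparse-gradient embedding can be created as below; the dimensions are placeholders.

```python
from mxnet import gluon

# gradients for this embedding are stored in row-sparse format,
# which reduces communication and memory for large vocabularies
embedding = gluon.nn.Embedding(input_dim=100000, output_dim=128, sparse_grad=True)
```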

Support for Synchronized Cross-GPU Batch Norm: Gluon now supports Synchronized Batch Normalization, available as gluon.contrib.nn.SyncBatchNorm. This enables stable training on large-scale networks with high memory consumption such as FCN for image segmentation.

Updated Gluon model zoo: The Gluon Vision Model Zoo now provides MobileNetV2 pre-trained models. Existing pre-trained models have been updated to provide state-of-the-art performance on all ResNet v1, ResNet v2, vgg16, vgg19, vgg16_bn, and vgg19_bn models.
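Loading one of the new pre-trained MobileNetV2 models takes a single call; the random input below stands in for a real preprocessed image batch.

```python
import mxnet as mx
from mxnet.gluon.model_zoo import vision

# download MobileNetV2 (width multiplier 1.0) with pre-trained ImageNet weights
net = vision.mobilenet_v2_1_0(pretrained=True)

# run a forward pass on a placeholder 224x224 RGB batch (NCHW layout)
data = mx.nd.random.uniform(shape=(1, 3, 224, 224))
prob = net(data).softmax()
```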

Introducing new Clojure bindings with MXNet

MXNet now has experimental support for the Clojure programming language. The MXNet Clojure package brings state-of-the-art deep learning to the Clojure community. It enables Clojure developers to code and execute tensor computation on multiple CPUs or GPUs, and to write seamless tensor/matrix computations with multiple GPUs in Clojure. Users can now construct and customize state-of-the-art deep learning models in Clojure and apply them to tasks such as image classification and data science challenges. To start using the Clojure package in MXNet, check out the Clojure tutorials and Clojure API documentation.

Introducing control flow operators

This is the first step towards optimizing dynamic neural networks with variable computation graphs. This release adds symbolic and imperative control flow operators such as foreach, while_loop and cond. To learn more about how to use these operators, check out the Control Flow Operators tutorial.
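A small sketch using the imperative form of while_loop to accumulate a running sum; the same operators also work on symbols inside hybridized blocks.

```python
import mxnet as mx

def cond(i, total):
    # continue while i <= 5; must return a single-element NDArray
    return i <= 5

def step(i, total):
    new_total = total + i
    # returns (per-step outputs, updated loop variables)
    return [new_total], [i + 1, new_total]

# computes 1 + 2 + 3 + 4 + 5 with a graph-friendly loop
outputs, state = mx.nd.contrib.while_loop(
    cond, step,
    loop_vars=[mx.nd.array([1]), mx.nd.array([0])],
    max_iterations=10)
print(state[1].asnumpy())  # [15.]
```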

Performance improvements

TensorRT runtime integration: TensorRT provides significant acceleration of model inference on NVIDIA GPUs compared to running the full graph in MXNet using unfused GPU operators. In addition to faster fp32 inference, TensorRT optimizes fp16 inference and is capable of int8 inference (provided the quantization steps are performed). Besides increasing throughput, TensorRT significantly reduces inference latency, especially for small batches. With the 1.3 release, MXNet introduces runtime integration of TensorRT (experimental) to accelerate inference. Follow the MXNet-TensorRT article on the MXNet developer wiki to learn more about how to use this feature.

MKL-DNN enhancements: MKL-DNN is an open source library from Intel that contains a set of CPU-optimized deep learning operators. In the previous release, MXNet introduced integration with MKL-DNN to accelerate training and inference execution on CPU. With the 1.3 release, we have added MKL-DNN support for the sigmoid, tanh, and softrelu activation functions.

ONNX export support

Export MXNet models to ONNX format: MXNet 1.2 provided users with a way to import ONNX models into MXNet for inference. More details are available in this ONNX blog post. With the latest 1.3 release, users can now export MXNet models into ONNX format and import those models into other deep learning frameworks for inference! Check out the MXNet to ONNX exporter tutorial to learn more about how to use the mxnet.contrib.onnx API.
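A hedged sketch of exporting a saved symbolic model to ONNX; the checkpoint file names and the input shape are placeholders.

```python
import numpy as np
from mxnet.contrib import onnx as onnx_mxnet

# paths to a previously saved MXNet model (placeholder names)
sym = "./resnet-symbol.json"
params = "./resnet-0000.params"

# export to ONNX, declaring a single input of shape (1, 3, 224, 224)
onnx_file = onnx_mxnet.export_model(sym, params, [(1, 3, 224, 224)], np.float32, "resnet.onnx")
```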

Other experimental features

Apart from what we have covered above, MXNet now has support for:

  1. A new memory pool type for GPU memory that is better suited to workloads with dynamic-shape inputs and outputs. Set the environment variable MXNET_GPU_MEM_POOL_TYPE=Round to enable this feature.

  2. A topology-aware Allreduce approach for single-machine multi-GPU training, which trains up to 6.6x and 5.9x faster on AlexNet and VGG respectively compared to MXNet 1.2. Activate this feature using the "control the data communication" environment variables.

  3. Improved Scala APIs that focus on providing type safety and a better user experience. Symbol.api and NDArray.api bring a new set of functions with complete signatures, and the documentation for all arguments integrates directly with IntelliJ IDEA. The new and improved Scala examples demonstrate usage of these new APIs.

Check out further details on these features in the full release notes.

Maintenance improvements

In addition to adding and extending new functionalities, the release also focuses on stability and refinements.

The community fixed 130 unstable tests, improving MXNet's stability and reliability. The MXNet Model Backwards Compatibility Checker was also introduced: an automated test on MXNet's continuous integration platform that verifies saved models' backward compatibility. This helps ensure that models created with older versions of MXNet can be loaded and used with newer versions.

Getting started with MXNet

Getting started with MXNet is simple: visit the install page. PyPI packages are available to install for Windows, Linux, and Mac.

To learn more about the MXNet Gluon package and deep learning, you can follow our 60-minute crash course and then complete this comprehensive set of tutorials, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. You can also check out lots of material in the MXNet tutorials, MXNet blog posts (also in Chinese), and the MXNet YouTube channel (also in Chinese). Have fun with MXNet 1.3.0!

Acknowledgments

We would like to thank everyone who contributed to the 1.3.0 release:

Aaron Markham, Abhinav Sharma, access2rohit, Alex Li, Alexander Alexandrov, Alexander Zai, Amol Lele, Andrew Ayres, Anirudh Acharya, Anirudh Subramanian, Ankit Khedia, Anton Chernov, aplikaplik, Arunkumar V Ramanan, Asmus Hetzel, Aston Zhang, bl0, Ben Kamphaus, brli, Burin Choomnuan, Burness Duan, Caenorst, Cliff Woolley, Carin Meier, cclauss, Carl Tsai, Chance Bair, chinakook, Chudong Tian, ciyong, ctcyang, Da Zheng, Dang Trung Kien, Deokjae Lee, Dick Carter, Didier A., Eric Junyuan Xie, Faldict, Felix Hieber, Francisco Facioni, Frank Liu, Gnanesh, Hagay Lupesko, Haibin Lin, Hang Zhang, Hao Jin, Hao Li, Haozhi Qi, hasanmua, Hu Shiwen, Huilin Qu, Indhu Bharathi, Istvan Fehervari, JackieWu, Jake Lee, James MacGlashan, jeremiedb, Jerry Zhang, Jian Guo, Jin Huang, jimdunn, Jingbei Li, Jun Wu, Kalyanee Chendke, Kellen Sunderland, Kovas Boguta, kpmurali, Kurman Karabukaev, Lai Wei, Leonard Lausen, luobao-intel, Junru Shao, Lianmin Zheng, Lin Yuan, lufenamazon, Marco de Abreu, Marek Kolodziej, Manu Seth, Matthew Brookhart, Milan Desai, Mingkun Huang, miteshyh, Mu Li, Nan Zhu, Naveen Swamy, Nehal J Wani, PatricZhao, Paul Stadig, Pedro Larroy, perdasilva, Philip Hyunsu Cho, Pishen Tsai, Piyush Ghai, Pracheer Gupta, Przemyslaw Tredak, Qiang Kou, Qing Lan, qiuhan, Rahul Huilgol, Rakesh Vasudevan, Ray Zhang, Robert Stone, Roshani Nagmote, Sam Skalicky, Sandeep Krishnamurthy, Sebastian Bodenstein, Sergey Kolychev, Sergey Sokolov, Sheng Zha, Shen Zhu, Sheng-Ying, Shuai Zheng, slitsey, Simon, Sina Afrooze, Soji Adeshina, solin319, Soonhwan-Kwon, starimpact, Steffen Rochel, Taliesin Beynon, Tao Lv, Thom Lane, Thomas Delteil, Tianqi Chen, Todd Sundsted, Tong He, Vandana Kannan, vdantu, Vishaal Kapoor, wangzhe, xcgoner, Wei Wu, Wen-Yang Chu, Xingjian Shi, Xinyu Chen, yifeim, Yizhi Liu, YouRancestor, Yuelin Zhang, Yu-Xiang Wang, Yuan Tang, Yuntao Chen, Zach Kimberg, Zhennan Qin, Zhi Zhang, zhiyuan-huang, Ziyue Huang, Ziyi Mu, Zhuo Zhang.

… and thanks to all of the Apache MXNet community supporters, spreading knowledge and helping to grow the community!

Thursday May 24, 2018

Apache MXNet 1.2.0 Release is out!


Today the Apache MXNet community announced the 1.2 release of the Apache MXNet deep learning framework. The new capabilities in MXNet provide the following benefits to users:


  1. MXNet is easier to use

    • New Scala inference APIs: This release includes new Scala inference APIs, which offer easy-to-use, Scala-idiomatic, and thread-safe high-level APIs for performing predictions with deep learning models trained with MXNet.

    • Exception Handling Support for Operators: MXNet now transports backend C++ exceptions to the different language front-ends and prevents crashes when exceptions are thrown during operator execution.

  2. MXNet is faster

    • MKL-DNN integration: MXNet now integrates with Intel MKL-DNN to accelerate neural network operators such as Convolution, Deconvolution, FullyConnected, Pooling, Batch Normalization, Activation, LRN, and Softmax, as well as some common operators such as sum and concat. This integration allows NDArray to contain data with MKL-DNN layouts and reduces data layout conversions to get maximal performance from MKL-DNN. Currently, the MKL-DNN integration is still experimental.

    • Enhanced FP16 support: MXNet now adds support for distributed mixed-precision training with FP16. It supports storing a master copy of weights in float32 with the multi_precision mode of optimizers, and improves the speed of float16 operations on x86 CPUs by up to 8x through the F16C instruction set.

  3. MXNet provides easy interoperability

    • Import ONNX models into MXNet: Implemented a new ONNX module in MXNet which offers an easy-to-use API to import ONNX models into MXNet's symbolic interface. Check out the example on how you could use this API to import ONNX models and perform inference with MXNet; a short sketch also follows this list. Currently, the ONNX-MXNet import module is still experimental.
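A minimal sketch of importing an ONNX model with the new module; the file name is a placeholder for any valid ONNX model.

```python
from mxnet.contrib import onnx as onnx_mxnet

# returns an MXNet symbol plus its parameter dictionaries
sym, arg_params, aux_params = onnx_mxnet.import_model("super_resolution.onnx")
# the symbol and parameters can then be bound with mx.mod.Module for inference
```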

Getting started with MXNet


Getting started with MXNet is simple. To learn more about the Gluon interface and deep learning, you can reference this comprehensive set of tutorials, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. If you’re a contributor to a machine learning framework, check out the interface specs on GitHub.

Tuesday December 05, 2017

Milestone 1.0.0 Release for Apache MXNet

We are excited about the availability of the milestone 1.0.0 release of the Apache MXNet deep learning engine. These new capabilities (1) simplify training and deploying deep learning models, and (2) enable implementation of cutting-edge performance enhancements. The new capabilities in MXNet provide the following benefits to users:

  1. MXNet is faster: The 1.0.0 release includes implementations of cutting-edge features that optimize the performance of training and inference. Gradient compression enables users to train models up to five times faster by reducing communication bandwidth between compute nodes without loss in convergence rate or accuracy. For speech recognition acoustic modeling, such as that behind the Alexa voice service, this feature can reduce network bandwidth by up to three orders of magnitude during training. With the support of the NVIDIA Collective Communication Library (NCCL), users can train a model 20% faster on multi-GPU systems.
    • Optimize network bandwidth with gradient compression: In distributed training, each machine must communicate frequently with others to update the weight vectors and thereby collectively build a single model, leading to high network traffic. The gradient compression algorithm enables users to train models up to five times faster by compressing the model changes communicated by each instance.
    • Optimize training performance by taking advantage of NCCL: NCCL implements multi-GPU and multi-node collective communication primitives that are performance-optimized for NVIDIA GPUs. NCCL provides communication routines that are optimized to achieve high bandwidth over the interconnect between multiple GPUs. MXNet supports NCCL to train models about 20% faster on multi-GPU systems.
  2. MXNet is easier to use: The 1.0.0 release includes an advanced indexing capability that enables users to perform matrix operations in a more intuitive manner.
    • Advanced indexing for array operations in MXNet: It is now more intuitive for developers to leverage the powerful array operations in MXNet. They can use the advanced indexing capability by leveraging existing knowledge of NumPy/SciPy arrays. For example, MXNet NDArray and NumPy ndarray are supported as indices, e.g. a[mx.nd.array([1, 2], dtype='int32')]. A short sketch follows this list.
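A short sketch of the new advanced indexing on NDArray; the array contents are placeholders.

```python
import mxnet as mx
import numpy as np

a = mx.nd.arange(12).reshape((3, 4))

# select rows 1 and 2 with an MXNet NDArray index
rows = a[mx.nd.array([1, 2], dtype='int32')]

# a NumPy ndarray index works the same way
same_rows = a[np.array([1, 2])]
print(rows.asnumpy())
```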

MXNet has helped developers and researchers make progress with everything from language translation to autonomous vehicles and behavioral biometric security. We are excited to see the broad base of users that are building production artificial intelligence applications powered by neural network models developed and trained with MXNet. For example, the autonomous driving company TuSimple recently piloted a self-driving truck on a 200-mile journey from Yuma, Arizona to San Diego, California using MXNet. This release also includes a full-featured and performance-optimized version of the Gluon programming interface. The ease of use associated with it, combined with the extensive set of tutorials, has led to significant adoption among developers new to deep learning. The flexibility of the interface has driven interest within the research community, especially in the natural language processing domain.

Getting started with MXNet

Getting started with MXNet is simple. To learn more about the Gluon interface and deep learning, you can reference this comprehensive set of tutorials, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. If you’re a contributor to a machine learning framework, check out the interface specs on GitHub.
