As a leading global telecommunications company, Huawei is dedicated to trusted programming. To achieve this goal, our team of developers relies heavily on the Rust programming language. As a founding member of the Rust Foundation, Huawei is fully committed to the growth and prosperity of the Rust community and sponsors Rust conferences including RustConf, Rust Nation UK, EuroRust, GoSim and Rust China Conference.
To demonstrate our commitment, we create development tools and deep code learning technology to support software engineering with Rust, and we continuously contribute to the improvement and expansion of Rust’s features, including those listed in the roadmaps of the language, compiler and library teams.
Macros 2.0, a GOSIM talk given by Rust compiler expert Vadim Petrochenkov.
A Tale of Binary Translation, a Rust Nation UK talk given by Rust library expert Dr Amanieu d’Antras.
Support asynchronous destruction of objects running out of scope (a motivating sketch follows this list):
Implement code generation for asynchronous drop
Implement prevention measures for running asynchronous drop synchronously
Implement type system pre-requisites for (2) if necessary - type system refactoring, linear types
Implement any other features in the rustc compiler that will help the Rust language team to make a decision about including async drop in the language
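The motivation can be sketched in a few lines. This is illustrative only: the Connection type and flush method are made-up names, and no async-drop syntax is shown, since designing that syntax is exactly what this work is about.

```rust
// Illustrative only: `Connection` and `flush` are hypothetical names.
// Today Drop::drop is a synchronous fn, so cleanup that needs to await,
// such as flushing buffered data over the network, cannot be expressed
// in a destructor.
struct Connection {
    buffered: Vec<u8>,
}

impl Connection {
    async fn flush(&mut self) {
        // ... send `self.buffered` over the network ...
        self.buffered.clear();
    }
}

impl Drop for Connection {
    fn drop(&mut self) {
        // We would like to `self.flush().await` here, but `.await` is not
        // allowed in a synchronous destructor; asynchronous drop would make
        // this kind of cleanup expressible.
    }
}
```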
It’s currently not possible to implement an efficient and strictly correct sequence lock in Rust, which is a very important low level synchronization primitive.
Reading the data from a sequence lock relies on the ability to read data that might be concurrently modified, checking afterwards whether a data race occurred (by checking an atomic sequence counter), and only then using the data after the check. (See the RFC for a detailed explanation.)
However, in the memory model of Rust (and C++), reading data that is concurrently modified results in a data race (undefined behavior) immediately, not just when using the data later.
Crates like seqlock work around this by using ptr::read_volatile, but the correctness is debatable.
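The read path of such a workaround looks roughly like this. This is a minimal sketch assuming T: Copy; the seqlock crate's real implementation differs in details.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

pub struct SeqLock<T> {
    seq: AtomicUsize,
    data: UnsafeCell<T>,
}

// Readers may race with a writer; that is the whole point of the design.
unsafe impl<T: Copy + Send> Sync for SeqLock<T> {}

impl<T: Copy> SeqLock<T> {
    pub fn read(&self) -> T {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            // Possibly racing with a concurrent writer. read_volatile is the
            // common workaround, but in the Rust/C++ memory model a racing
            // non-atomic read is already UB, which is what RFC 3301 addresses.
            let value = unsafe { std::ptr::read_volatile(self.data.get()) };
            fence(Ordering::Acquire);
            let s2 = self.seq.load(Ordering::Relaxed);
            // An even, unchanged counter means no writer was active.
            if s1 == s2 && (s1 & 1) == 0 {
                return value;
            }
        }
    }
}
```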
C++’s P1478 adds atomic_load_per_byte_memcpy and atomic_store_per_byte_memcpy to provide a solution for that problem. Rust should have something similar, but something that fits better into Rust’s type and safety system.
Rust RFC 3301 is a proposal for a generic type AtomicPerByte<T> that allows concurrent reads and writes without undefined behavior, while still permitting tearing, so reads and writes of any size are possible. It leverages the MaybeUninit<T> type to represent values in a potentially invalid (torn) state.
The &std::ffi::CStr type is used to represent (null terminated) C strings in Rust. Currently, this type has some subtle issues and is in many ways less ergonomic than the regular string type. Some of the areas of potential improvement:
There is no syntax for a &CStr literal. (Update: We now have the experimental c"…" syntax.)
Due to a language limitation, &CStr is currently represented as a pointer+size pair. It should instead be just a pointer without a size, since the size is already determined by the null terminator.
Because of this, conversion from *const c_char requires scanning the whole string to find its size. That should be a free/nop conversion instead (see the sketch after this list). See the note in this documentation.
For the same reason, &CStr cannot be passed through a C FFI boundary. Ideally, &CStr should have the same ABI as a *const c_char.
CStr has far fewer useful methods than str (e.g. finding, splitting, replacing, etc.), making it hard to work with directly.
format!() can only produce a String, not a std::ffi::CString.
All bytes in a CStr (before the terminator) are non-zero, but none of the methods use NonZeroU8 to leverage the type system for this invariant.
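The conversion cost mentioned above can be seen with today's standard library API; the wrap helper below is only an illustration.

```rust
use std::ffi::{c_char, CStr};

// Because &CStr currently carries a length, building one from a raw C pointer
// must scan the whole string for the null terminator, an O(n) operation, even
// if the string is only going to be handed straight back to C.
unsafe fn wrap(ptr: *const c_char) -> &'static CStr {
    // Safety: the caller must guarantee `ptr` is a valid, null-terminated
    // string that lives for the 'static lifetime.
    CStr::from_ptr(ptr)
}
```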
Currently, CPU feature detection macros such as is_x86_feature_detected! are only available in std. This is because, on most platforms, feature detection requires information from the operating system, which is not available in core by design.
However, an alternative design would be to expose an API for manually enabling features: the set of enabled features would be available in core, but code in std would be responsible for querying the set of supported features from the OS on startup and marking them as enabled in core.
This will allow code in core and alloc to take advantage of CPU-specific optimizations such as using string processing instructions to accelerate certain functions.
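For context, this is how the std-only macro is used today; the checksum function name is ours and purely illustrative.

```rust
// Runtime feature detection currently requires std, because the macro has to
// ask the operating system which features the running CPU supports.
#[cfg(target_arch = "x86_64")]
fn checksum(data: &[u8]) -> u32 {
    if is_x86_feature_detected!("avx2") {
        // A real implementation would dispatch to an AVX2-accelerated
        // routine here.
    }
    // Portable fallback.
    data.iter().map(|&b| u32::from(b)).sum()
}
```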
Const generics allow the use of values in the type system. They are most often used for arrays with a generic size: [u32; N], where N can be an arbitrary usize. This improves code clarity, reusability and the general experience of working with arrays, and it can reduce heap allocations and increase performance. Changes to the standard library relying on const generics have also increased compilation speed and documentation quality.
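A small, self-contained example of the feature on stable Rust; the function name is illustrative.

```rust
// N is part of the array's type, so no heap allocation or runtime length
// bookkeeping is needed.
fn first_half<const N: usize>(items: &[u32; N]) -> &[u32] {
    &items[..N / 2]
}

fn main() {
    let data = [1u32, 2, 3, 4];
    assert_eq!(first_half(&data), &[1, 2]);
}
```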
One of the fundamental properties of BTreeMap is that it maintains elements in sorted order and enables efficient element lookup in O(log(N)) time. However, the current API is overly fitted towards a key-value API like a HashMap and fails to expose the ability to make queries about “nearby” keys. For example, finding the first element whose key is greater than X.
This proposal adds Cursor and CursorMut types to BTreeMap based on similar types for LinkedList.
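The kind of query in question can be expressed today with the stable range API, as in the sketch below; the proposed cursors additionally let code keep walking forwards or backwards from that position without a new tree search.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

fn main() {
    let map = BTreeMap::from([(1, "a"), (5, "b"), (9, "c")]);

    // "First element whose key is greater than 5".
    let next = map.range((Bound::Excluded(5), Bound::Unbounded)).next();
    assert_eq!(next, Some((&9, &"c")));
}
```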
Support syntactic sugar for conveniently delegating implementation to other types (a sketch of the pattern appears after this list).
Implement a compiler pass detecting delegation-like patterns in code
Run the pass on code from crates.io, analyze found delegation patterns, analyze previous delegation proposals
Implement syntactic sugar for common delegation patterns as a procedural macro or a built-in language feature
Write an RFC that proposes including delegation into Rust language
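The boilerplate that delegation sugar targets looks like the following plain-Rust sketch; all names are illustrative.

```rust
struct Inner;

impl Inner {
    fn len(&self) -> usize {
        0
    }
    fn clear(&mut self) {}
}

struct Wrapper {
    inner: Inner,
}

// Every forwarded method currently has to be written out by hand; delegation
// sugar would generate these forwarding bodies from a concise declaration.
impl Wrapper {
    fn len(&self) -> usize {
        self.inner.len()
    }
    fn clear(&mut self) {
        self.inner.clear()
    }
}
```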
Accurate diagnostic locations for tools like linters in presence of macro expansions.
Discuss specific cases that need improvements
Formalize the suggested improvements and implement them
Rust’s diagnostics working group is leading an effort to add support for internationalization of error messages in the compiler, allowing the compiler to produce output in languages other than English.
Translated error messages will allow non-native speakers of English to use Rust in their preferred language.
On most platforms, these structures are currently wrappers around their pthread equivalent, such as pthread_mutex_t. These types are not movable, however, forcing us to wrap them in a Box, resulting in an allocation and indirection for our lock types. This also gets in the way of a const constructor for these types, which makes static locks more complicated than necessary.
In terms of performance, this feature has reduced the overhead of locks by at least 2x, and in extreme multi-core/multi-threaded scenarios it has delivered super-linear speedups.
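One visible benefit of the pure-Rust lock implementation is that static locks become trivial to declare, as in this small example.

```rust
use std::sync::Mutex;

// Mutex::new is a const fn, so a static lock needs no Box, no lazy
// initialization and no heap allocation.
static COUNTER: Mutex<u64> = Mutex::new(0);

fn bump() -> u64 {
    let mut guard = COUNTER.lock().unwrap();
    *guard += 1;
    *guard
}
```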
Inline Assembly enables many applications that need very low-level control over their execution, or access to specialized machine instructions.
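A short illustration of the stabilized asm! macro; the rdtsc helper is our own example of an operation with no plain-Rust equivalent.

```rust
use std::arch::asm;

// Read the x86_64 time-stamp counter directly.
#[cfg(target_arch = "x86_64")]
fn rdtsc() -> u64 {
    let lo: u32;
    let hi: u32;
    unsafe {
        // RDTSC places the counter's low half in EAX and high half in EDX.
        asm!("rdtsc", out("eax") lo, out("edx") hi);
    }
    ((hi as u64) << 32) | lo as u64
}
```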
The Keyword Generics Initiative is a new initiative in Rust with the goal of researching the ability to abstract over the color of functions, or “effects”. See the official announcement post for more details. It is currently not possible to nicely abstract over “const-ness” and “async-ness” in stable Rust.
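The duplication problem can be shown in miniature; the retry helpers below are our own illustrative names, not part of any proposed API.

```rust
// Both helpers retry an operation once on failure, but because the operation
// may or may not be async, two nearly identical definitions are needed today.
fn retry<T>(mut op: impl FnMut() -> Option<T>) -> Option<T> {
    op().or_else(|| op())
}

async fn retry_async<T, F>(mut op: impl FnMut() -> F) -> Option<T>
where
    F: std::future::Future<Output = Option<T>>,
{
    match op().await {
        Some(v) => Some(v),
        None => op().await,
    }
}
```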
Enable parallel compilation in rustc to improve compilation efficiency.
It has already been implemented in Nightly. See the following blog post for details: https://blog.rust-lang.org/2023/11/09/parallel-rustc.html
Polymorphization is a code-size optimisation, aimed at reducing unnecessary monomorphization, thereby reducing the quantity of LLVM IR generated. By reducing the quantity of generated LLVM IR, it is expected that time spent in LLVM during compilation will decrease, resulting in improved overall compilation times.
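A minimal example of the monomorphization polymorphization aims to avoid; the function name is illustrative.

```rust
// `T` never influences the body of `announce`, yet without polymorphization
// rustc monomorphizes (and LLVM compiles) one identical copy per type the
// function is instantiated with. Polymorphization detects the unused
// parameter and emits a single shared copy instead.
fn announce<T>() {
    println!("called");
}

fn main() {
    announce::<u8>();
    announce::<String>(); // a second, byte-for-byte identical instantiation
}
```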
The format_args!() macro and underlying std::fmt::Argument type form the basis of all printing and formatting machinery in the Rust standard library, such as println!(), format!(), and write!().
By improving the macro and the in-memory representation of the fmt::Arguments type, a significant reduction in binary size can be achieved. This is especially important for embedded software running on devices with limited storage and memory.
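As a concrete illustration of that machinery:

```rust
use std::fmt::Write;

fn main() {
    let mut out = String::new();
    // write! expands to a call that passes a fmt::Arguments value built by
    // format_args!; println! and format! funnel through the same machinery,
    // which is why shrinking that representation shrinks many binaries.
    write!(out, "{} + {} = {}", 1, 2, 3).unwrap();
    assert_eq!(out, "1 + 2 = 3");
}
```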
To make macros more flexible, we are currently working on the lifetimes and hygiene of macros to ensure safe expansion of unsafe code inside macros.
Features and tools to aid in the creation and usage of dynamic libraries, and research towards developing a new ABI and a new in-memory representation for interoperability across high-level programming languages that have safe data types.
Huawei supports the efforts of the Rust project and Rust Foundation towards a complete and accurate official Rust language specification.
The Rust language specification will have a big impact on the ability to write safety-critical software in Rust, will improve the internal consistency of the language, and will be an important tool to greatly improve the precision and completeness of existing and future language proposals.
rustc already has support on stable for split debuginfo on Windows (*.pdb) and macOS (*.dSYM), but is missing support for split debuginfo on Linux (Split DWARF’s *.dwp/*.dwo files).
Large applications built with debug information have slow linking times, can experience out-of-memory failures at link time and slow debugger start-up times. Furthermore, debuginfo in these applications may result in a significant increase in storage requirements and additional network traffic in distributed build environments.
Nearly all thread_local! { … } variables are of Cell or RefCell type, since without interior mutability, they’d just be thread local constants.
The variable created by the thread_local! { … } macro (of type LocalKey<T>) is used by calling .with(|v| ..) on it, to restrict the lifetime to stay within the current thread. (See this example.)
The values inside a Cell and RefCell are used through methods like .set(), .get(), .borrow(), etc. This results in a lot of verbosity to access a simple thread local (mutable) integer: LOCAL.with(|v| v.set(123)).
RFC 3184 adds methods on LocalKey<Cell<T>> and LocalKey<RefCell<T>> to shorten such cases to just: LOCAL.set(123).
Additionally, LOCAL.set(value) directly initializes the thread local with the specified value, unlike LOCAL.with(|v| v.set(value)), where with will (on the first call) initialize the thread local with the default value first and set will then immediately destroy that value by overwriting it with the new value.
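Both forms side by side, using the now-stabilized methods on LocalKey<Cell<T>>; the COUNTER name is illustrative.

```rust
use std::cell::Cell;

thread_local! {
    static COUNTER: Cell<u32> = Cell::new(0);
}

fn main() {
    // The verbose pre-RFC form:
    COUNTER.with(|c| c.set(1));
    // The shorthand added by RFC 3184:
    COUNTER.set(2);
    assert_eq!(COUNTER.get(), 2);
}
```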
Start an initiative with the goal of replacing the current trait system implementation of rustc. This new implementation should fully replace both fulfill and evaluate and offer an API a lot closer to the ideal of chalk/a-mir-formality.
Currently working on opaque type support for the next generation trait solver. With that, all stable type system features are supported by the new solver. Many small improvements will still be needed afterwards: both to remove the last dependencies on the old solver and to avoid breaking stable code when enabling the new solver by default.
Right now this is mostly about leaking private types from public interfaces (like functions returning private types), and lints trying to prevent it.
People have been disagreeing about what “public” means in this context, and the RFC specifies a new design based on reachability that will match people’s intuition better. Ivakin Kirill is implementing the reachability algorithms and lints now.
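A minimal example of the pattern these lints target; the module and item names are illustrative, and current rustc warns about it via the private_interfaces lint.

```rust
pub mod api {
    struct Token; // private to this module

    // warning: `Token` is more private than the public item `issue`
    pub fn issue() -> Token {
        Token
    }
}

fn main() {
    // Callers can obtain a value of a type they cannot even name.
    let _token = api::issue();
}
```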
docs.rs is a crucial part of the Rust developer’s toolbelt, and Huawei sponsors continued development of docs.rs features and maintenance of the project.
hashbrown is the hash table implementation used in Rust’s standard library. This crate is performance-critical for both the Rust compiler and many Rust programs. Huawei sponsors the continued development of hashbrown features and maintenance of the project.
rustc_codegen_gcc adds support for GCC as a backend for the Rust compiler. It will allow supporting many more compilation targets while also benefiting from GCC’s advantages.
rustdoc is a crucial part of the Rust developer’s toolbelt, and Huawei sponsors continued development of rustdoc features and maintenance of the project. Contributions have included reductions in the size of generated documentation, intra-doc links, -> * query support, and more.
Analysis of diagnostic warnings from non-machine-applicable Clippy lints finds an average of 21 warnings per kLOC in crates.io projects, compared with 0.49 warnings per kLOC in the Rust compiler. By learning from manual warning fixes in rust-lang/rust Git repositories, warnings can be automatically fixed through code transformations and machine learning.
Inlay hints are auxiliary information provided to a Rust editor/IDE by Rust Analyzer. By converting hints such as type information and parameter names into the code, we can train a machine learning model on what programmers would see with the help of an IDE.
A much faster alternative to HashMap for very small maps. It is also faster than FxHashMap, hashbrown, ArrayMap, IndexMap, and all others. The smaller the map, the higher the performance. It was observed that when a map contains more than 20 keys, it may be better to use the standard HashMap, since the performance of micromap::Map may start to degrade. See the benchmarking results below.
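A sketch of typical usage, under the assumption that micromap's Map takes its fixed capacity as a const generic parameter and offers a HashMap-like insert/get API; consult the crate documentation for the exact signatures.

```rust
use micromap::Map;

fn main() {
    // Assumed API: at most 8 key-value pairs, stored inline on the stack.
    let mut m: Map<u64, &str, 8> = Map::new();
    m.insert(1, "one");
    m.insert(2, "two");
    assert_eq!(m.get(&1), Some(&"one"));
}
```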
Training a machine learning model to efficiently classify Rust functions and blocks as unsafe or safe (using 374 mLOC from crates.io). Accuracy on functions is 95%, and accuracy on blocks is 85%. Results can be obtained in less than a second while editing the item, faster than waiting for compilation.
Rust Library Team
Consultant with Ada Lab, Ireland Research Centre, Huawei
Moscow Research Centre, Huawei
Research Scientist, Ireland Research Centre, Huawei
Intern, Ada Lab, Ireland Research Centre, Huawei
Rust Compiler Team Co-Lead
Programming Languages Lab, Edinburgh Research Centre, Huawei
Rust Developer Tools Team
Ada Lab, Paris Research Centre, Huawei
Open Source Management Centre, Shenzhen, Huawei
Open Source Management Centre, Shenzhen, Huawei
Rust Compiler Team
Consultant with Ada Lab, Ireland Research Centre, Huawei
Ada Lab, Ireland Research Centre, Huawei
Open Source Management Centre, Shenzhen, Huawei
Rust Library Team Lead
Consultant with Ada Lab, Ireland Research Centre, Huawei
Open Source Management Centre, Shenzhen, Huawei
Rust Compiler Team
Programming Languages Lab, Edinburgh Research Centre, Huawei
Fundamental Software Engineering Lab, Waterloo Research Center, Huawei
Ada Lab, Ireland Research Centre, Huawei
Open Source Management Centre, Shenzhen, Huawei
Rust Compiler Team
Moscow Research Centre, Huawei
Trustworthiness Software Engineering and Open Source Lab, Huawei
Rust Parallel Compiler Working Group
Open Source Management Centre, Shenzhen, Huawei
Ada Lab, Ireland Research Centre, Huawei
Open Source Management Centre, Shenzhen, Huawei
Fundamental Software Engineering Lab, Waterloo Research Center, Huawei
Open Source Management Centre, Shenzhen, Huawei
Library Development for Embedded Systems, Hangzhou, Huawei
Director of System Programming Lab, Russian Research Centre, Huawei
Director of Ada Lab, Ireland Research Centre, Huawei
Moscow Research Centre, Huawei
Project Manager of MemSafePro Y4 project, Huawei TTE Lab
Research Areas:
Software Engineering,
Open Source and Innersource
Opportunities:
To apply for Rust positions, please send CV, Positions
Research Areas:
Ownership/Lifetime Analysis,
Program Synthesis
Research Areas:
Functional Programming,
Sugaring/Desugaring,
Algorithm Synthesis
Opportunities:
Postdoc position, Overseas PhD
Research Areas:
Deep Code Learning,
Software Engineering
Opportunities:
Research Engineer
Research Areas:
Empirical Studies,
Bug Localization
© Trusted Programming Team, Huawei