
Major update #512

Merged
merged 17 commits into IBM:master from version-update on Jun 27, 2023
Conversation

@maljoras (Collaborator) commented Jun 26, 2023

Summary

  • Major update from the development branch.
  • New training algorithms: Chopped Tiki-taka, AGAD
  • Major re-organization of AnalogTiles for increased modularity (TileWithPeriphery, SimulatorTile, SimulatorTileWrapper). Analog tile modules (which might hold multiple simulator tiles) are now also torch Modules and are used by the analog layers. New analog tiles can be customized.
  • Added CustomTile
  • Added TorchInferenceTile for a fully torch-based analog tile for inference (not using the C++ RPUCuda engine), supporting a subset of MVM nonidealities
  • New inference preset: StandardHWATrainingPreset
  • New inference noise model: ReRamWan2022NoiseModel (see also #394: Noise model fitted and characterized on RRAM devices, covering programming noise and output-referred read noise)
  • Improved HWA-training for inference featuring input and output range learning and more
  • Improved memory management (using torch cached GPU memory for internal RPUCuda buffer, significantly reducing the memory needed for running analog models)
  • Change in generators: analog_model.analog_tiles() now loops over all available tiles (in all modules)
  • Generator: analog_layers() loops over layer modules (except AnalogSequential) and replaces analog_modules()
  • New correlation detection example for comparing specialized analog SGD algorithms
  • Simplified build_rpu_config script for generating RPUConfigs for analog in-memory SGD
  • Import and file location changes. However, users can import all RPUConfig-related classes from aihwkit.simulator.configs
  • convert_to_digital utility
  • Change: convert_to_analog now also considers mapping. Set mapping.max_input_size = 0 and mapping.max_output_size = 0 to avoid this.
  • Added logical TileModuleArray for logical weight matrices larger than a single tile. Mapped layers now use this tile array
  • Change: Checkpoint structure is different. utils.legacy_load provides a way to load old checkpoints
  • realistic_read_write is removed from some high-level functions. Use program_weights (after setting the weights) or read_weights for realistic reading (using a weight estimation technique); see the sketch after this list.
  • New training preset: ReRamArrayOMPresetDevice, ReRamArrayHfO2PresetDevice, ChoppedTTv2*, AGAD*
  • Pulse counters for pulsed analog training
  • Dumping of all C++ fields for accurate analog training saving and training continuation after checkpoint load.
  • apply_write_noise_on_set for pulsed devices
  • Reset now also for simple devices
  • SoftBoundsReference, PowStepReference for explicit reference subtraction of symmetry point in Tiki-taka
  • Analog MVM with output-to-output std-deviation variability (output_noise_std)
  • per_batch_sample weight noise injections for TorchInferenceRPUConfig
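
For the realistic_read_write replacement mentioned above, here is a minimal sketch (the AnalogLinear/SingleRPUConfig setup and tensor shapes are illustrative, and it assumes read_weights returns a (weights, biases) tuple analogous to get_weights):

import torch

from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import SingleRPUConfig

model = AnalogLinear(4, 2, rpu_config=SingleRPUConfig())
tile = next(model.analog_tiles())

target = torch.rand(2, 4)   # target weight matrix [out_size, in_size]
tile.set_weights(target)    # set the weights digitally (exact)
tile.program_weights()      # realistic write: SGD with pulsed updates
weights, biases = tile.read_weights()  # realistic read via weight estimation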

Details

  • make black for code formatting.
  • Switched to Python 3.10, torch >= 1.9, C++14, numpy >= 1.22

FIXED: read_weights could have applied the scales wrongly (when learning of the out scaling was used)

Now forward is called instead of analog_forward. To avoid confusion, analog_forward is renamed to joint_forward, since it calls pre_forward, tile.forward, and post_forward. It applies the mapping scales but not the learnable parts, such as the digital bias and the out-scaling alpha.
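
A minimal sketch of the distinction (the layer sizes and the InferenceRPUConfig choice are illustrative):

import torch

from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import InferenceRPUConfig

model = AnalogLinear(4, 2, bias=True, rpu_config=InferenceRPUConfig())
x = torch.rand(3, 4)

y_full = model(x)  # full forward, including digital bias and learned out scaling

tile = next(model.analog_tiles())
# joint_forward chains pre_forward, tile.forward, and post_forward; it applies
# the mapping scales but not the digital bias or out-scaling alphas.
y_tile = tile.joint_forward(x)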

ADDED: Output noise per output column support

Here the forward.out_noise_std parameter is introduced, which enables a systematic output-to-output variation of the output noise std deviation. It is given in relative terms, e.g. 0.3 means 30% relative variation around out_noise.

The parameter values are drawn at instantiation; however, they can be modified afterwards, e.g.:

from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import InferenceRPUConfig

rpu_config = InferenceRPUConfig()
rpu_config.forward.out_noise_std = 0.1  # 10% variation around out_noise
rpu_config.forward.out_noise = 0.1
model = AnalogLinear(4, 2, bias=True, rpu_config=rpu_config)

analog_tile = next(model.analog_tiles())  # get an individual tile
dic = analog_tile.get_forward_parameters()
dic['out_noise_values'][0] = 0.23  # e.g. set the output noise std-dev of the first output line
analog_tile.set_forward_parameters(dic)

CHANGED: Generators

  • analog_modules generator loops through all AnalogModuleBase instances including AnalogSequential and AnalogWrapper
  • analog_layers generator loops through all AnalogModuleBase instances excluding AnalogSequential and AnalogWrapper
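
A minimal sketch of the difference (the model itself is illustrative):

from aihwkit.nn import AnalogLinear, AnalogSequential
from aihwkit.simulator.configs import InferenceRPUConfig

model = AnalogSequential(
    AnalogLinear(4, 3, rpu_config=InferenceRPUConfig()),
    AnalogLinear(3, 2, rpu_config=InferenceRPUConfig()),
)

# analog_layers skips container modules such as AnalogSequential:
print([type(layer).__name__ for layer in model.analog_layers()])
# -> ['AnalogLinear', 'AnalogLinear']

# analog_tiles loops over all available tiles in all modules:
num_tiles = sum(1 for _ in model.analog_tiles())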

ADDED: Plot the error for a particular device

Each analog training tile has a program_weights method, which uses SGD and pulsed updates to program the weights according to the device properties. This can be used to create a weight-error plot that can be compared to inference results with the phenomenological noise_model.

Example:

import matplotlib.pyplot as plt

from aihwkit.utils.visualization import plot_programming_error

from aihwkit.simulator.presets import (
    MixedPrecisionEcRamMOPreset,
    MixedPrecisionReRamESPreset,
    StandardHWATrainingPreset,
)

figure = plt.figure(figsize=[10, 5])
ax = figure.add_subplot(1, 1, 1)

for preset, name in [(StandardHWATrainingPreset(), 'PCM model (1h)'),
                     (MixedPrecisionReRamESPreset(), 'ReRAM [GDP]'),
                     (MixedPrecisionEcRamMOPreset(), 'EcRAM MO [GDP]'),
                     ]:
    plot_programming_error(
        preset,
        w_range=(-0.8, 0.8),
        n_bins=51,
        t_inference=3600.,
        realistic_read=True,  # otherwise no drift compensation
        label=name
    )

plt.legend()
plt.show()

This would produce a plot of the programming error over the weight range for the three presets.

ADDED: presets for new training algorithms and convenient way to define new presets

Here a new function, build_config, is added to generate an RPUConfig for any analog training algorithm and device.

For example,

from aihwkit.simulator.configs import build_config
from aihwkit.simulator.presets import StandardIOParameters, ReRamSBPresetDevice

algorithm = 'agad'  # one of: 'sgd', 'tiki-taka', 'ttv2', 'mp', 'c-ttv2', 'agad'
rpu_config = build_config(algorithm, ReRamSBPresetDevice, StandardIOParameters)

would generate a valid configuration for the given training algorithm, which can be sgd, tiki-taka, ttv2, mp, c-ttv2, or agad.
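
The resulting rpu_config can then be used like any other RPUConfig, e.g. (a minimal sketch with illustrative layer sizes):

from aihwkit.nn import AnalogLinear

model = AnalogLinear(10, 5, rpu_config=rpu_config)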

ADDED: Torch inference tile

It was hard to make quick changes to the C++ backend, so we created a pure torch-based version. Note, however, that it supports only a subset of the nonidealities of the RPUCuda-based tile.

  • New torch analog tile (similar to the base tile) and a new torch simulator tile that is similar to the C++ tile.
  • Example showing how to use the new torch tile.
  • Tests comparing the functionality against the C++-based tile.
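
A minimal usage sketch via TorchInferenceRPUConfig (the import path follows the aihwkit.simulator.configs convention used above; the sizes are illustrative):

import torch

from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import TorchInferenceRPUConfig

rpu_config = TorchInferenceRPUConfig()  # fully torch-based, no C++ RPUCuda engine
model = AnalogLinear(8, 4, bias=True, rpu_config=rpu_config)
y = model(torch.rand(2, 8))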

@maljoras maljoras requested review from kaoutar55 and kkvtran June 26, 2023 21:25
kkvtran previously approved these changes Jun 26, 2023
@maljoras maljoras requested a review from kkvtran June 27, 2023 12:03
@maljoras maljoras merged commit 79e4900 into IBM:master Jun 27, 2023
@maljoras maljoras deleted the version-update branch June 27, 2023 13:51