Changelog for geopm-0.6.0+dev73gb985ae9-1.1.x86_64.rpm:
Tue Oct 2 14:00:00 2018 Christopher M. Cantalupo v0.6.0
- Stabilized Agent code path.
- Last release with Decider/Platform/PlatformImp support.
- Modified implementations and interfaces:
  - Modify PowerGovernor to ignore DRAM power and tune parameters for power balancer.
  - Profile larger set of MPI functions including non-blocking routines.
  - Removed push_region_signal_total() and sample_region_total() from PlatformIO.
  - This functionality is available to Agents by creating an instance of RegionAggregator.
  - Redesigned geopmanalysis command line interface so that the first argument selects the analysis type.
  - Add options to geopmanalysis for min and max frequency for frequency sweep analysis types.
  - Remove geopmanalysis --level option and replace with --summary and --plot.
  - This allows summaries and/or plots to be generated separately.
  - Add option to use agent code path to geopmanalysis (use_agent).
  - Change EnergyEfficientAgent frequency map to use JSON format.
  - Introducing GEOPM_EXEC_WRAPPER environment variable useful for inserting a debugger into the integration tests.
  - Reuse same idx val for repeated pushes of signals/controls.
  - Cat lscpu output to /tmp prior to running job and avoid popen call inside of MPI app.
  - Change PowerGovernorAgent::wait() to use time instead of RAPL updates.
  - Get rid of C-string from ProfileTable implementation.
  - Add max_level() to TreeComm.
  - Introducing the PowerGovernor class.
  - Introducing Agent::aggregate_sample() static helper function for Agents.
  - Add agent field to io.py dataframe index. Note: this will break compatibility with scripts that use the old index.
  - Rename RAPL related MSR names: SOFT_POWER_LIMIT to PL1_POWER_LIMIT and HARD_POWER_LIMIT to PL2_POWER_LIMIT.
  - Add geopm_time_since() method.
  - Update the analysis.py energy references.
  - Add RegionAggregator class for per-region signal totals.
  - Update Reporter to use RegionAggregator.
  - Changed region counts to start at -1 before first entry.
  - Get rid of unused and undocumented environment variable GEOPM_REPORT_VERBOSITY.
  - Modify launcher to set LD_PRELOAD only for application.
  - Change some AppOutput methods to return pandas Dataframes instead of Report/Region objects.
  - Add barrier in MPI_Init prior to GEOPM startup.
  - Have RootRole throw if bad power cap is set.
- Updated features:
  - Introducing the new PowerBalancer agent with many commits since v0.5.1 that tweak the algorithm.
  - Ignore epoch calls when made inside of a region marked with the ignore hint.
  - Add MSRIOGroup signals that return the raw value of an MSR.
  - Use slurm option to select the performance power governor when using GEOPM.
  - Add a spec file for building GEOPM for ALCF Theta.
  - Add profile name and agent to trace header.
  - Add CYCLES_THREAD and CYCLES_REFERENCE to trace.
  - Add Agent support in python scripts.
  - Add CORAL 2 version of AMG to examples.
  - Update markup for miniFE example to set region ID once per region.
  - Update nekbone patches for scaling studies.
  - Suppress OMP warnings in launcher when using Intel toolchain.
  - Add PowerSweepAnalysis type to geopmanalysis.
  - Add BalancerAnalysis type to geopmanalysis.
  - Add NodeEfficiencyAnalysis type to geopmanalysis.
  - Add NodePowerAnalysis type to geopmanalysis.
  - Introduce a plotter method to generate histograms.
  - Have ManagerIO skip policy file parsing if agent has no policies.
  - Add HDF5 caching for parsed reports and traces to io.py.
  - Add summary features to analysis where summarized data is written to files in ascii tables.
- Updated and extended integration tests:
  - Updates to integration tests to support the Agent / PlatformIO code path are a major feature of this release.
  - Adding back integration test for power balancer with increased time limit.
  - Automatically infer architecture based on hostname.
  - Add monitor as available agent to run integration tests.
  - Use regular runtime for epoch in test_region_runtimes.
  - Require balancer test to run in an allocation.
  - Checks average power limit across nodes is under cap in test_power_balancer.
  - Add integration test that runs GEOPM, but does not generate reports.
- Updates to documentation:
  - Add documentation to the README about the scaling_governor.
  - Add documentation of constructor attribute for plugins to geopm(7) man page.
  - Add documentation for hint ignore interaction with geopm_prof_epoch().
  - Add documentation for all of the supported region hints.
  - Remove documentation about node barrier enforced by epoch call, this is no longer true.
  - Remove reference to MPIEXEC from spec file.
  - Add missing launcher options to help text.
- Updated unit tests:
  - Add PowerBalancer unit tests.
  - Add PowerBalancerAgent unit tests.
  - Add analysis.py unit tests.
  - Add more detailed checks of TreeComm calls to KontrollerTest.
  - Add tests of geopmanalysis CLI.
  - Fix tests for ControlMessage.
- Bug fixes:
  - Fix catch-value warning from GCC 8.
  - Fix possible C string truncation.
  - Fix for null characters sometimes appearing in report header.
  - Fix string sizing for strncpy and snprintf for gnu8.
  - Fix null termination in case of string overflow.
  - Fix in PowerGovernorAgent where fan_in could be accessed out of bounds.
  - Fix Kontroller index into Agent array; the level 0 Agent should not do descend() or ascend().
  - Fix issue where second region runtime is longer than first: move region exit barrier after call to sample.
  - Fix geopmagent so it can create empty json files.
  - Fix launcher to handle --cpu-bind as well as --cpu_bind.
  - Fix failure to restore fixed counter MSRs at end of GEOPM runtime.
  - Fix epoch region ID detection in io.py.
  - Fix for test_trace_runtimes with agent code path.
  - Fix performance issue: if power will be controlled, adjust one CPU per package.
  - Fix EnergyEfficientAgent init().
  - Fix issue where geopm would try to restore MSR MISC_ENABLE which is read only.
  - Fix test_power_consumption to measure socket power only.
  - Fix order of MSR save / agent init() to avoid failure to restore time window setting.
  - Fix --enable-overhead configure option.
  - Fix pthread launch for Agent code path.
  - Fix Fortran comm initialization.
  - Fix handling of bad OMP masks.
  - Fix for klocwork error: missing null check.
  - Fix pthread launch when using MPICH by enabling MPI_THREAD_MULTIPLE in environment.
  - Fix pthread launch issue in Cray Linux by using secure versions of the CPU_SET macros.
  - Fix hang when runtime is active but report has not been requested.
  - Fix python scripts to support old data missing separate dram energy in report.
  - Fix python scripts to handle new agent field in parsed header.
  - Fix race in ControlMessage that could cause hang at GEOPM runtime start up.
  - Fix for ompt region names in Reporter.
  - Fix issue where slack was calculated prior to adding in extra power in PowerBalancingAgent.
Sat Jun 23 14:00:00 2018 Brad Geltz v0.5.1
- GEOPM beta hotfix release!
- Introduce the PowerGovernorAgent. This agent is implemented and fully featured.
- Restoring the MSR values at the end of a run is now best effort since the system whitelist may prevent the write from being allowed.
- Allow min/max frequencies to be specified in the EnergyEfficientAgent's policy.
- Fix geopmread usages for tutorial.
- Fix MSR overflow logic, performance counter initialization, and MSR encode/decode functions.
- Fix integration tests for geopmwrite use cases.
Wed May 30 14:00:00 2018 Christopher M. Cantalupo v0.5.0
- GEOPM beta release!
- Community updates:
  - New landing page
  - New Slack channel
  - New Code of Conduct
  - New pull request template
  - Contributing instructions updated with details of gerrit review process.
- Modified implementations and interfaces:
  - Major refactor of the controller and plugin architecture is provided as an optional new code path.
  - Most of the changes made to the implementation for this release modify the new code path.
  - The old code path is still available for users as long as the controller is run without the GEOPM_AGENT environment variable set.
  - The new code path will be active if the user selects an agent by name with the GEOPM_AGENT environment variable when launching the controller.
  - The old code path is maintained in the current Controller object along with the Decider / Platform / PlatformImp plugins.
  - The new code path is maintained in a replacement for the Controller which has been temporarily named the Kontroller.
  - The Kontroller will be renamed the Controller after this release, and the old code path will no longer be available.
  - Similar to the Kontroller/Controller replacement, the KprofileIOGroup, KprofileIOSample, and KruntimeRegulator are temporary replacements for their non-K counterparts and will be renamed.
  - The beta release enables a new set of plugin interfaces named the IOGroup, Agent, and Comm.
  - It is through the IOGroup, Agent and Comm plugins that the GEOPM runtime can be extended.
  - The Decider / Platform / PlatformImp plugin extensions are deprecated and will be removed after this release.
  - The IOGroup plugin enables a user to add new signal and control mechanisms for an Agent to read and write.
  - The Agent plugin enables a user to add new monitor and control algorithms to the GEOPM runtime.
  - MPI use by the GEOPM runtime which is not linked by application has been completely encapsulated in the Comm object.
  - The tutorial has been extended with two new directories: tutorial/agent and tutorial/iogroup.
  - The tutorial/iogroup directory documents how to write an IOGroup plugin.
  - The tutorial/agent directory documents how to write an Agent plugin.
  - The interface to the resource manager has been made much more flexible for supporting the new Agent interfaces.
  - The resource manager interface is documented in the geopm_agent_c(3) and geopm_endpoint_c(3) man pages.
  - Additionally command line tools have been proposed and partially implemented to support the interfaces documented in those man pages.
  - The geopm_agent_c(3) APIs and geopmagent(1) CLI have software support.
  - The endpoint interfaces are a work in progress that has not yet been integrated into the mainline source.
  - The PlatformIO object provides the interface to the IOGroups.
  - The PlatformIO C++ object will soon have an associated C interface documented as geopm_platformio_c(3).
  - The geopmread and geopmwrite tools provide a CLI to the PlatformIO features.
  - Introducing the MSRIOGroup which provides an implementation of the IOGroup for MSRs.
  - Introducing the TimeIOGroup which provides an IOGroup for the time signal.
  - Introducing the CpuinfoIOGroup which provides data from /proc/cpuinfo as signals.
  - Introducing the ProfileIOGroup which provides profile data collected from the main compute application through the geopm_prof_c(3) APIs.
  - The release includes three new installed binaries: geopmread, geopmwrite, and geopmagent.
  - Each of these command line interfaces is documented with a man page and there is a man page for a future command line tool called geopmendpoint.
  - Deprecated geopm_policy_*() interfaces that have been replaced with the geopm_agent_*() and geopm_endpoint_*() APIs.
  - Introducing the first three Agent implementations: MonitorAgent, PowerBalancerAgent, and EnergyEfficientAgent.
  - Introducing PlatformTopo, replacement for PlatformTopology.
  - Introducing DefaultProfile singleton which supports geopm_prof_c(3) APIs for profiling.
  - Added documentation for monitor, energy_efficient, and power_balancer Agents, but the implementation is not currently aligned.
  - The monitor agent is implemented and fully featured.
  - The energy_efficient agent will soon be extended to match the man page, and currently use of the network is not enabled.
  - The existing implementation of the energy_efficient agent does currently provide similar functionality to the efficient_freq Decider.
  - The power_balancer agent is a work in progress that is not well aligned with the man page, but will be feature complete soon.
  - Reports and traces generated by Agent code path are designed to be backward compatible with reports and traces generated with the Decider code path.
  - New environment variables documented in geopm(7): GEOPM_ENDPOINT, GEOPM_AGENT, GEOPM_TRACE_SIGNALS, and GEOPM_DISABLE_HYPERTHREADS.
  - Remove GEOPM_ERROR_AFFINITY_IGNORE environment variable, no longer required for testing.
  - New plugin registration mechanism has been put in place and new factory has been implemented.
  - Replace independent factories with a single templated class, the PluginFactory.
  - No longer register a plugin using a half instantiated object.
  - Removed call to dlsym, and plugins now use __attribute__((constructor)) to specify a callback target used when plugin is loaded.
  - In this callback the plugin should register with its respective factory.
  - Each plugin type has a make_plugin() static method that creates the plugin object and returns a pointer to the base class.
  - The make_plugin() function pointer is what is registered with the factory.
  - Extend the PluginFactory to require the registration of a dictionary (map) to enable queries of plugin capabilities.
  - Use stricter criterion for selecting plugin files to load, name must be of the form libgeopmpi *.so.0.0.0 where 0.0.0 is the GEOPM ABI version.
  - Moved geopm_plugin_description_s definition to geopm.h.
  - Add a configure option to enable use of the msr-safe ioctl interface for writing with PlatformIO.
  - The msr-safe ioctl interface should not be used for writing unless the system has an msr-safe installation that has fixed .
  - Added APIs for manipulating hint bits in region id hash.
  - Many changes were made to modernize the use of C++.
  - Change protected members of all classes to private where possible.
  - Replace all raw pointer usage with C++11 smart pointers if possible.
  - Use default keyword for constructors and destructors where appropriate.
  - Use delete keyword rather than throw to avoid copy constructor.
  - Add override keyword to derived classes.
  - Use forward declaration of classes rather than include one header inside of another.
  - Add and integrate make_unique implementation for C++11.
  - Confirmed const correctness for all class methods.
  - Add public interface to register IOGroups with PlatformIO which enables IOGroups to be created at runtime.
  - Standardize the IOGroup signal and control names so that they are prefixed by the IOGroup name and two colons.
  - Agents should generally use high level aliases rather than these low level signals and controls.
  - Introduce functions for converting between signals and bit-fields to allow for PlatformIO to provide full 64 bit integer signals like the region ID.
  - Add overflow function type to MSR class.
  - Change frequency APIs to use Hz to enforce uniform use of SI units.
  - Use instruction offset in OMPT derived region name; this resolves a name ambiguity when more than one OpenMP region is discovered within the same function.
  - Use gmock archive uploaded to the geopm organization on github.
  - PlatformTopo is built on top of lscpu and does not require hwloc.
  - Throw on GlobalPolicy misconfiguration earlier in the runtime execution.
  - Rename SimpleFreqDecider to EfficientFreqDecider which will be replaced by EnergyEfficientAgent.
  - Update to efficient Decider and Agent related environment variables according to above name changes.
  - The json-c library is no longer a dependency, all references have been removed.
  - Now using the json11 library which is distributed in the "contrib" sub-directory.
- Updated features:
  - Enable Agent to augment report and trace.
  - Enable user to augment trace through environment variable GEOPM_TRACE_SIGNALS in new code path.
  - Changes to PlatformIO to support non-CPU domains.
  - Added MSR save/restore functionality to PlatformIO save/reset interfaces.
  - Allow loading PlatformIO when some IOGroups fail to load.
  - Add aggregation functions to PlatformIO to encode how to combine signals.
  - Add PlatformTopo methods for converting domain to string and vice-versa.
  - Add signal_names() and control_names() to PlatformIO and IOGroup.
  - Add Skylake server (SKX) as a supported platform.
  - Add Haswell and SandyBridge MSRs to PlatformIO interface.
  - OMPT report region names include instruction offset, now two OpenMP regions within the same function can be distinguished.
  - Add region runtime as default trace column.
  - Simpler column names in trace; print some columns using old names.
  - Change region ID to hex in report and trace.
  - Order regions in report by runtime.
  - Add application total ignore time to report.
  - Replace tabs with spaces for report formatting.
  - Enable PlatformIO to support Epoch based signals.
  - Add power signals to PlatformIO using derivative calculation previously done in Region object.
  - Add PlatformIO aliases for region ID, progress, frequency and energy.
  - Add CombinedSignal class which is used to combine signals from different IOGroups.
  - Allow for a user provided number of experiment iterations (loops) to perform for each geopmanalysis type.
  - Enable geopmanalysis to provide more detailed information about the results.
  - Allow turbo to be skipped by geopmanalysis when determining the best per-region frequencies.
  - Updates to geopmanalysis python script to bypass trace parsing if requested and in debug plot ignore check for multiple profile names.
  - Use hyphen instead of underscore in geopmanalysis options for consistency with other interfaces.
  - Don't require -n and -N with geopmanalysis when skipping launch.
  - Pass output_dir through to plotter when using geopmanalysis.
  - Changes to analysis.py for SC17 data: multiply energy percent by 100, have frequency sweep plots use frequencies from profile name.
  - Add geopmanalysis option to specify controller launch method.
- Updated and extended integration tests:
  - Integration tests validated with the GEOPM_AGENT set to test new code path.
  - A few problems with the new code path exposed by integration tests have been added to github issues.
  - A few changes to support integration tests with new code path have been integrated.
  - Change io.py and integration tests: Allow hex numbers for region ID in report, skip extra lines in report.
  - Remove Platform plugin registration.
  - Update EfficientFreqDecider to use new runtime metric for performance.
  - Update EfficientFreqDecider to use PlatformIO directly and remove method from Policy object for adjusting frequency.
- Updated unit tests:
  - Many unit tests have been added to accompany the new code path which has many new classes.
  - The new classes were specifically designed to enable unit testing poorly covered code that it refactors.
  - Refactor Profile constructor into testable functions.
  - Add unit tests for Profile class.
  - Simple profile class in test directory for testing and debug: enables profiling of the GEOPM runtime itself.
  - More detailed checks of messages in unit tests when exceptions are thrown.
  - Fix test-license to assert that files in MANIFEST.EXEMPT exist.
  - Remove TestPlugin code that is not used by tests.
  - Add make check target to tutorial build.
- Bug fixes:
  - Update GEOPM runtime C APIs to print to standard error instead of having the controller suppress error messages.
  - Handle exceptions that occur during app/controller handshake.
  - Enable timeout rather than hang if Controller or application fail during execution.
  - Fix for package-scoped MSRs that will write to all CPUs in a package rather than just one.
  - Fix HSX and SKX frequency control MSRs to core domain.
  - Fix issue when running on systems with offline CPUs.
  - Do not report a completed send if policy or sample contains a NAN.
  - Fix lscpu parsing for offline CPUs.
  - Exclude regions with 0 count from report, except unmarked region, which is always 0.
  - Add verbose error message when PluginFactory::dictionary() is called with plugin name that has not been registered.
  - Fix get_alloc_nodes for slurm in geopmpy launcher.
  - Fix for test_power_consumption to check the current platform cpuid to decide power budget.
  - Fix geopmpy.launcher for Intel's mpiexec: does not accept -- as a separator for positional arguments.
  - Fix for when GEOPM_PLUGIN_PATH contains multiple paths.
  - Fix tutorial tarball so that it will build out of place.
  - Fix shared memory issues during start-up when launching the Controller as a separate application.
  - Remove erroneous double split of the Controller's comm; the ppn1 comm is already passed into the constructor.
  - Fix test to use in-memory file system to avoid adding missing msync() calls.
  - Fix resource leak in TreeCommunicator constructor.
  - Fix tracing capability with geopmanalysis.
  - Leave -- separator in list of arguments to avoid parsing command line arguments intended for application as launcher arguments.
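
The v0.5.0 notes above describe a registration scheme in which each plugin hands a make_plugin() function and a dictionary of capabilities to a single factory from a load-time callback. The sketch below is a Python analogue for illustration only. GEOPM's PluginFactory is a C++ template driven by __attribute__((constructor)). The class layout, the on_plugin_load() callback, the "example" plugin name, and the "NUM_POLICY" key are all hypothetical.

```python
from typing import Callable, Dict


class PluginFactory:
    """Hypothetical single factory shared by all plugins of one type."""

    def __init__(self) -> None:
        self._make: Dict[str, Callable[[], object]] = {}
        self._dictionary: Dict[str, Dict[str, object]] = {}

    def register_plugin(self, name, make_plugin, dictionary) -> None:
        # Store the creation function, not a half-instantiated object.
        if name in self._make:
            raise ValueError(f"plugin already registered: {name}")
        self._make[name] = make_plugin
        self._dictionary[name] = dict(dictionary)

    def make_plugin(self, name):
        if name not in self._make:
            raise KeyError(f"no plugin registered with name: {name}")
        return self._make[name]()

    def dictionary(self, name):
        # Query plugin capabilities without creating the plugin.
        if name not in self._dictionary:
            raise KeyError(f"no plugin registered with name: {name}")
        return self._dictionary[name]


class Agent:
    """Base class that users of the factory program against."""


class ExampleAgent(Agent):
    @staticmethod
    def make_plugin() -> Agent:
        # Creates the plugin and returns it as the base type.
        return ExampleAgent()


agent_factory = PluginFactory()


def on_plugin_load() -> None:
    # Stands in for the callback that runs when the plugin is loaded.
    agent_factory.register_plugin(
        "example", ExampleAgent.make_plugin, {"NUM_POLICY": 0}
    )


on_plugin_load()
agent = agent_factory.make_plugin("example")
```

Registering a function rather than an instance means no plugin object exists until one is requested, while the dictionary still allows capability queries by name.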
Fri Jan 12 13:00:00 2018 Christopher M. Cantalupo v0.4.0
- Modified implementations and interfaces:
  - Updated algorithm for choosing CPU affinity in the launcher: fill application CPUs from back to front, and never share physical cores between MPI ranks.
  - Created new abstraction for interfacing with MSRs and more broadly for abstracting hardware IO (PlatformIO, MSRIO, and MSR classes).
  - Application region hints are now properly exposed to the decider.
  - Added geopmanalysis executable to the geopmpy package; this executable runs applications and performs analysis of power and performance based on GEOPM report and trace data.
  - Added geopmbench to the installed binaries; this is simply an installed version of the tutorial_6 executable.
  - Added GEOPM_RM environment variable and --geopm-rm command line option to select geopmpy.launcher's back end resource manager.
  - Updated man pages to include geopmanalysis and geopmbench.
  - Removed handling of SIGCHLD signal in GEOPM runtime (commonly raised in non-error conditions when using popen(3)).
  - Launcher will guess correct number of OpenMP threads if user has not specified.
  - Added warning message at start up if report and trace files will not be created due to permissions issues.
  - Added better error handling to tutorial sources.
  - Added support for geopmctl to be run as a different user than application.
  - Added support for user provided shmkey's that do not begin with '/'.
  - Added error checking in launcher for when user requests more ranks per node than there are cores per node.
  - Added more robust error checking for command line issues in launcher.
  - Added command line option to launcher to exclude use of hyperthreads: --geopm-disable-hyperthreads.
  - If a plugin fails at registration time, do not bring down the controller; a warning is printed if debug is enabled.
  - Remove -s parameter from geopmctl CLI (was being ignored).
  - Encapsulated use of MPI by GEOPM inside of a class abstraction (IComm), but controller has not been modified to use the new class due to deadlock bug.
  - Encapsulated in a class the handshake interface between the controller and the application across shared memory.
  - General clean up of the geopmpy.plotter implementation.
  - Added more error checking in Controller.
  - Some fixes for issues exposed by static analysis.
- Updated features:
  - Added new decider called "simple_freq" that adjusts CPU frequency to save energy with a small impact to performance; name will likely change to "efficient_freq" in the future.
  - Added region runtime reporting to traces and Region objects based on the average execution time of a region by all of the ranks on a node.
  - Added a method to the Region object to give access to the telemetry time stamps to the decider.
  - Added online learning approach to energy efficient frequency decider.
  - Added support to geopmpy.launcher for launching with Intel(R) MPI's mpiexec.
  - Added option to plotter to use all samples or just epoch samples.
  - Modified the tutorials to enable use of the geopmpy launcher.
  - Improved tutorial Makefile to allow user override of GNU Make standard variables.
  - Added an RPM spec file for use with the OpenHPC distribution.
- Updated and extended integration tests:
  - Moved Controller death test from the unit tests to the integration tests.
  - Added integration tests for pthread and application launch of the controller.
  - Added an isolated hardware test for RAPL power limit functionality.
- Updated documentation: both man pages and doxygen have been reviewed and cleaned up.
- Updated unit tests:
  - Added unit test for SubsetOptionParser.
  - Reduced dependence of unit tests on MPI runtime.
  - Removed MPIProfileTest unit test which is covered by integration tests, and not really a unit test.
  - Removed unused MPIControllerTest.
  - Removed MVAPICH2 Fortran tests.
- Bug fixes:
  - Fixed broken build in tutorials (tutorial_region.c).
  - Fixed faulty argument parsing by the geopmpy launcher.
  - Fixed error reporting when using geopmpy with python 3.x.
  - Fixed issues with affinity when launching the controller as a pthread.
  - Fixed issue in passing power budgets down a multi-level tree.
  - Fixed issue in platform choice when head node architecture differs from the compute nodes.
  - Fixed broken build if --disable-doc configuration option is passed.
  - Fixed decider setup code to correctly propagate power bounds down tree.
  - Fixed the way RAPL time window is set.
  - Fixed the use of cached data by geopmpy.plotter.
  - Fixed integration test issues related to systems with multiple cluster node partitions.
  - Fixed process CPU affinity implementation (don't use hwloc) and added unit tests for this.
  - Fixed potential overflow issue with error messages in PlatformImp.cpp.
  - Fixed race in SharedMemory test.
  - Fixed markup patch for MiniFE.
  - Fixed launcher when user explicitly requests OMP_NUM_THREADS=1.
  - Fixed MPIInterfaceTests so it uses only mocked MPI interfaces, and does not explicitly require MPI.
  - Fixed memory leaks in GlobalPolicy.
  - Fixed linking order of libgeopm and libmpi.
  - Fixed non-performance mode integration test launcher.
  - Fixed issue where libgeopmpolicy had false dependence on OMPT.cpp.
  - Fixed rpm Makefile target to avoid the rpmbuild -t option to avoid trying to use the OpenHPC spec file.
  - Fixed issue where platform topology could be determined from nodes other than the ones that run the job.
  - Fixed Intel(R) MPI launcher's use of host files and the --ppn CLI.
  - Fixed incompatibility between MVAPICH2 affinity and srun affinity.
  - Fixed test_progress_exit integration test to account for extrapolation error.
  - Fixed integration test for MPI time accounting.
  - Fixed launcher problem when node is listed in multiple queues by sinfo.
  - Fixed and improved affinity assignment in corner cases.
  - Fixed use of sched_getcpu() for Mac OS X.
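
The v0.4.0 affinity rule (fill application CPUs from back to front, never share physical cores between MPI ranks) can be illustrated with the minimal sketch below. It is not the launcher's actual code. The function name, its parameters, and the Linux CPU numbering (hardware thread h of core c is CPU c + h * num_core) are assumptions, as is the choice to give each rank every hardware thread of its cores.

```python
from typing import List


def assign_rank_cpus(num_core: int, thread_per_core: int,
                     num_rank: int, core_per_rank: int) -> List[List[int]]:
    """Return one list of Linux CPU indices per rank on a node.

    Hypothetical helper: cores are handed out from the highest index
    downward, and each physical core belongs to exactly one rank.
    """
    if num_rank * core_per_rank > num_core:
        # Mirrors the launcher check for more ranks than cores per node.
        raise ValueError("requested more cores than the node provides")
    result = []
    next_core = num_core  # one past the highest unassigned core
    for _ in range(num_rank):
        cores = range(next_core - core_per_rank, next_core)
        # Assumed numbering: hardware thread h of core c is c + h * num_core.
        cpus = sorted(core + ht * num_core
                      for core in cores
                      for ht in range(thread_per_core))
        result.append(cpus)
        next_core -= core_per_rank
    return result


# Two ranks, two cores each, on an 8-core node with 2 threads per core.
affinity = assign_rank_cpus(num_core=8, thread_per_core=2,
                            num_rank=2, core_per_rank=2)
```

With these inputs the first rank receives cores 6 and 7 and the second receives cores 4 and 5, so the low-numbered cores stay free for the operating system and the controller.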
Mon Jun 19 14:00:00 2017 Christopher M. Cantalupo v0.3.0
- GEOPM alpha release!
- Modified implementations and interfaces:
  - Added job launch wrapper script which simplifies GEOPM runtime launch.
  - Added plotting support for visual analysis of report and trace data.
  - Added python package: geopmpy for supporting python infrastructure (job launch/plotting).
  - Added support for OMPT integration with the OpenMP runtime to mark GEOPM region entry and exit.
  - Added support for PMPI interface use in fortran applications enabling full support for fortran applications.
  - Added support to profile individual MPI functions as distinct regions.
  - Added support for transmission of region hints from the application to the controller.
  - Removed MPI_Pcontrol() interface for wrapping geopm_prof_*() interfaces.
  - Removed geopm_ctl_spawn() interface.
  - Removed geopm_prof_disable() interface.
  - Changed to single aggregated report file per run instead of one per node.
  - Changed the geopm_tprof_*() interfaces for thread progress.
  - Changed GEOPM classes to derive from a pure virtual interface base class.
  - Changed RPM build from RPM makefile in favor of geopm.spec.in/configure.
  - Changed the report and trace file format to have headers with meta-data.
  - Changed how the GEOPM_PROFILE environment variable is used: now dictates the profile name.
  - Changed geopm_ctl_c interface to no longer be application facing.
  - Changed requirement for power plane 0 controls: MSR no longer used/needed.
  - Changed all application hints from *POLICY_HINT* to *REGION_HINT*.
  - Changed build time wget/curl timeout periods to be longer.
- Updated features:
  - Added support for per-cpu progress reporting from application.
  - Added hint to ignore time spent in a region such that ignored region times are subtracted from epoch times.
  - Added policy information to report.
  - Added user id to shmkey prefix to avoid permissions issues with stale keys.
  - Added man page for the geopmpy python package, geopmsrun and geopmaprun.
  - Added documentation for new features and interface changes.
  - Added cache file support to plotter.
  - Added interface to Region object to get per-cpu progress.
  - Added feature to track mpi runtime per region and print in the report.
  - Added feature to treat unmarked code as a real region.
  - Added support to resolve OMPT function address to a name in report.
  - Added launcher support for keeping the controller off of Linux CPU 0 if possible.
  - Added support for hyper-threads and multi-socket system affinity in launcher.
  - Added significant rework of Environment class to avoid security issues.
  - Added geopm_env_debug_attach() API.
  - Added region hint support in the ModelRegion wrappers for integration tests.
  - Added mvapich2 fortran90 test suite for testing GEOPM fortran interfaces.
  - Added autotools make check support for python unit tests.
  - Added standard PIP packaging of the geopmpy python package and posting on PYPI.
  - Added build infrastructure for support for LLVM OpenMP runtime with OMPT enabled.
- Updated and extended integration tests:
  - Added support for using launcher wrapper within integration tests.
  - Added integration test for OMPT and MPI automatic region detection.
  - Added better support for the integration test looping script.
  - Added integration test job timeouts.
  - Added proper clean up of reports when a test passes.
  - Added setting of OMP_NUM_THREADS when running integration test.
  - Added test to compare the regions detected in the trace to the report.
  - Added integration test for MPI timing.
- Updated unit tests:
  - Added unit tests for the Environment and SharedMemory classes.
  - Added python unit test for affinity settings in the launcher script.
  - Added support for edge cases in unit tests.
- Bug fixes:
  - Fixed geopmpolicy to generate a whitelist file without requiring root.
  - Fixed critical security issues from static analysis.
  - Fixed missing symbol wrappers for init and finalize MPI fortran functions.
  - Fixed buffer overflow in MPI API test.
  - Fixed missing resize of m_level to the active number of levels per node in the TreeCommunicator.
  - Fixed issue where gfortran does not support bit shift operations of more than 32 bits.
  - Fixed shared memory cleanup at attach time.
  - Fixed issue where PlatformImp was initialized twice.
  - Fixed reporting of unmarked regions.
  - Fixed bugs in plotter.
  - Fixed const issue with MPI-2/MPI-3 interface definitions.
  - Fixed big-o scaling for all2all ModelRegion.
  - Fixed integration tests for unmarked regions.
  - Fixed test_progress_exit integration test.
  - Fixed standard directory specification in the spec file.
  - Fixed test_sample_rate integration test.
  - Fixed check_run issue in scaling integration test.
  - Fixed integration tests and unit tests to handle the new node-combined report with header format.
  - Fixed launcher to check for srun affinity plugins before using them.
  - Fixed fortran configure test for MPI-3 support.
  - Fixed gfortran test to work with ubuntu.
  - Fixed mac compile issues.
  - Fixed fortran test makefile.
  - Fixed documentation to remove all references to geopmkey.
Wed Apr 5 14:00:00 2017 Christopher M. Cantalupo v0.2.3
- Fixed broken OBS build of version 0.2.2.
- Fixed broken integration test for region timing.
Tue Apr 4 14:00:00 2017 Christopher M. Cantalupo v0.2.2
- Modified implementations and interfaces:
  - Added environment variable GEOPM_RUN_LONG_TESTS to enable long running integration tests.
  - Added environment variable GEOPM_KEEP_FILES to leave temporary files created by unit tests.
  - Added environment variable GTEST_XML_DIR to configure location of junit xml output from unit tests.
  - Changed documentation for geopm_epoch(): multiple calls per application is okay.
  - Changed geopm_epoch() calls in examples to reflect new usage.
  - Changed GoverningDecider to use much simpler and more effective algorithm.
  - Changed all TreeCommunicator MPI runtime communication to send binary data: do not use MPI data marshaling.
  - Changed all TreeCommunicator MPI runtime communication to one-sided MPI_Put() calls.
  - Changed tuning for parameters used by BalancingDecider.
  - Changed tuning for RAPL time window settings.
  - Changed TDP percentage to double throughout code.
  - Changed copyright dates for 2017.
- Updated features:
  - Added least squared linear regression to calculate derivative.
  - Added compiler optimizations for Intel when using Intel toolchain.
  - Added environment control GEOPM_PROFILE_TIMEOUT of application timeout when waiting for controller.
  - Added warning message about stale keys.
  - Added throttling percentage to reports.
  - Added GEOPM runtime/memory/network overhead calculation and reporting.
  - Added --enable-overhead configure option for heavy-weight overhead measurement.
  - Added support for Cray MPI.
  - Added region IDs to report files.
  - Added junit xml output from unit tests.
  - Added energy hardware counter update sample triggering (reduce latency and jitter).
  - Added memory buffering for trace object, buffer size is hardcoded to 128 MB (should be configurable).
  - Added rpmbuild --nocheck support (check definition in spec file).
  - Added minimal documentation about CPU affinity requirements.
- Added an example that will print affinity of MPI processes and OpenMP threads. - Added a stability fix for power calculation that will be made more robust. - Updated examples: - Added CoMD to examples. - Added QBOX to examples. - Added AMG to examples. - Updated and extended integration tests: - Added support for ALPS to integration tests. - Added support for resource manager detection. - Added support for integration test environment configuration options. - Added support for better signal handling to integration tests. - Added integration tests that use the trace feature. - Added integration tests for scaling compute node count. - Added integration tests for power cap enforcement by GoverningDecider. - Added integration tests that region entry is always preceded by region exit. - Added integration tests for sample rate frequency and jitter. - Added integration test for consistency between report and trace per region run-times. - Updated unit tests: - Added data driven unit test for derivative feature. - Added unit tests for PMPI wrappers. - Bug fixes: - Fixed documentation for installing from OBS yum and zypper repos. - Fixed some objects which were improperly using default copy constructor. - Fixed issue where unmarked regions (region 0) would report a progress value other than zero. - Fixed accounting issue when exiting a region and then immediately entering it again. - Fixed issue where RAPL values would be reset upon PlatformImp destruction (bad behavior for applications that change values and exit like geopmpolicy). - Fixed error handling in integration test script. - Fixed issue due to changing return type of json_object_array_length() for different versions of the json-c library. - Fixed issue preventing samples from being sent up tree beyond level 1. - Fixed issue with stale shared memory keys by deleting them at start up. - Fixed missing comm swap call in MPI_Gather() and MPI_Gatherv(): terminal error. - Fixed TreeCommunicator topology mapping logic. 
- Fixed issue with message vector sizing in TreeCommunicator. - Fixed missing ronn executable documentation build issue. - Fixed TreeCommunicator unit tests. - Fixed MPIInterface tests exposed by CLANG. - Fixed RAPL window MSR interface. - Fixed user control of GNU standard build variables when running make. - Fixed missing GEOPM annotation in some MPI wrappers in geopm_pmpi.c. - Fixed accounting for region entries. - Fixed issue by skipping TreeCommunicator tests on OpenMPI prior to 1.8.8 where one-sided comm was fixed.
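The v0.2.2 entry above mentions using linear regression to calculate a derivative. The following is a minimal sketch of that general technique, not the GEOPM implementation: it fits a straight line to a window of (time, value) samples and returns the slope as the derivative estimate. The function name `slope_least_squares` and its arguments are hypothetical.

```python
def slope_least_squares(times, values):
    """Estimate d(value)/d(time) as the slope of the least-squares line.

    Hypothetical helper for illustration; not part of the GEOPM API.
    """
    n = len(times)
    if n != len(values):
        raise ValueError("times and values must have the same length")
    if n < 2:
        raise ValueError("at least two samples are required")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    # Slope = covariance(time, value) / variance(time)
    cov = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    if var == 0:
        raise ValueError("all time stamps are identical")
    return cov / var
```

Fitting over a window of samples rather than differencing two adjacent samples tends to smooth out measurement noise, at the cost of responding more slowly to real changes.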
Fri Nov 18 13:00:00 2016 Christopher M. Cantalupo v0.2.1 - Fix for accounting problem with nested MPI exits. - Fix to thread calculation in integration test to avoid hyper-threads. - Added script to loop over integration tests.
Fri Nov 11 13:00:00 2016 Christopher M. Cantalupo v0.2.0 - Renamed package to Global Extensible Open Power Manager. - Improved features, performance, documentation, testing and continuous integration. - Many bug fixes. - Modified CONTRIBUTING.md to reflect current work-flow. - Enabled Travis-CI on github repository. - Linked Travis-CI to Open SUSE Build Service for automation of multi-distro packaging and testing. - Removed explicit creation and destruction of geopm_prof_c objects from public interface. - Introduced new environment variable GEOPM_PROFILE to control profiling. - Introduced new environment variable GEOPM_DEBUG_ATTACH to enable attaching with a serial debugger. - Removed geopm_prof_print interface. - Removed "-r" command line option from geopmctl. - Made the power budget in the policy an average per-node budget instead of a whole job budget. - Modified report to include geopm version. - Added accounting in report for the number of entries into each region. - Added reporting of application totals. - MPI is no longer explicitly a region and MPI accounting is now part of application totals. - Refined how the geopm_prof_outer_sync() API works and renamed interface geopm_prof_epoch(). - The epoch start is no longer associated with application synchronization as geopm_prof_outer_sync was. - Epoch start marks the beginning of the outermost iterative algorithm of the application. - Added a --disable-doc configuration option for systems without ronn. - Changed default shmem key base from "geopm_default" to "geopm-shm". - Enabled GEOPM profiling without application modification through LD_PRELOAD. - Appended domain numbers to the trace file column headers. - Brought policy back to trace output. - Modified implementation to print warning if controller is not found by the Profile interface. - Enabled building in the SUSE environment. - Added an example that prints the geopm hash of any string. 
- Added support for Broadwell E Xeon and Knights Landing Xeon Phi platforms. - Added capability to save/restore MSR values before/after GEOPM runs. - Major improvements to signal handling and shutdown clean up. - Improvements to temporary file and shared memory management. - Added a suite of tutorials that steps through GEOPM features. - Posted video walkthrough of the GEOPM tutorials to YouTube. - Created the ideal "model" application for geopm shown in tutorial 6. - Added integration test infrastructure using python unittest and model application. - Added patches for GEOPM mark up to MiniFE and Nekbone benchmark source code. - Added support for batch MSR read through msr-safe ioctl interface. - Tuned decision making algorithms based on performance of several benchmarks. - Allowed GoverningDecider to "unconverge". - Added separate throttling times for sampling and control. - Moved LockingHashTable template to a non-template implementation. - Added distinct entries in profile table for MPI and epoch events. - Switched to one sided communication (MPI_Put/MPI_Get) for passing samples up. - When a new policy is received at the leaf it is enforced immediately. - Modified implementation to unlink shared memory regions as soon as all users have attached. - Added an example which will check if geopm supports the current platform which is used to skip some tests. - Made check for supported platform more robust. - Removed all throw calls inside destructor methods. - Re-implemented application/controller handshake. - Moved default profile object into Singleton pattern. - Cleaned up factory registration pattern. - Added better error checking of user inputs. - Applied the write mask when writing to a MSR. - Abstracted the read_bandwidth signal in the PlatformImp classes. - Made PlatformImp objects abstract to signal topology. - Added death tests for the controller. - Removed use of MPI::Exception and all other MPI C++ constructs as they are deprecated. 
- Wrote an abstraction of the hwloc interface to remove the hwloc version specific implementation requirement. - Introduced XeonPlatformImp which Xeon platforms inherit from. - Proposed a class interface to abstract MPI usage by GEOPM's controller. - Fixed MSR read to mask off bits read from MSR beyond the overflow bit. - Fixed possible under/over power budget conditions. - Fixed a number of issues in report and trace output. - Fixed issue where hash table could overflow. - Fixed policy creation so that all the man page examples work correctly. - Fixed subtraction of MPI time from outer sync time. - Fixed accounting error in reported per region run-time. - Fixed msr write logic for multi-socket systems. - Fixed MSR save/restore. - Fixed usage of RAPL time window 1 and 2. - Fixed race condition: use MPI_Isend instead of MPI_Irsend. - Fixed RAPL interface logic. - Fixed geopm_time_add() to avoid overflowing nsec field. - Fixed frequency calculation in report. - Fixed the region entry count in report. - Fixed issues around MPI_Request usage in non-blocking MPI calls. - Fixed decider and accompanying logic. - Fixed issue related to sending new policies down when new decisions are made. - Fixed race condition in application/controller handshake. - Fixed shutdown logic in PMPI wrapper when controller is run as a pthread. - Fixed test executable so that non-matching test filters give an error. - Fixed bug in MSR restore from file related to overflow. - Fixed issue that occurs when using googlemock with gcc 6. - Fixed issues around incorrect use of PMPI wrappers. - Fixed a number of issues in the PMPI wrappers. - Fixed PMPI wrappers to work with both the MPI-2 and MPI-3 standards. - Fixed missing dlclose() calls for dynamically opened shared objects. - Fixed issue related to launching the controller with pthread in PMPI wrapper. - Fixed multiple platform issues. - Fixed death test issue due to inconsistent SLURM exit status codes. 
- Fixed CPU indexing bug in PlatformImp derived classes. - Fixed typo in Environment.cpp which was breaking GEOPM_ERROR_AFFINITY_IGNORE environment variable. - Fixed the mask for getting frequency from IA32_PERF_STATUS. - Fixed broken download, switched to Fedora URL for downloading gmock 1.7.0.
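The v0.2.0 entry above notes a fix to geopm_time_add() so that the nsec field does not overflow. The changelog does not show the code, so the following is only a sketch of the general carry technique for a (seconds, nanoseconds) time value. The function `time_add` and its tuple representation are hypothetical and stand in for the C structure the library actually uses.

```python
import math

NSEC_PER_SEC = 1_000_000_000


def time_add(sec, nsec, elapsed):
    """Add elapsed seconds (a float) to a (sec, nsec) pair.

    Hypothetical illustration: the returned nsec is always in the range
    [0, NSEC_PER_SEC), with any excess carried into the seconds field.
    """
    whole = math.floor(elapsed)
    frac_nsec = int(round((elapsed - whole) * NSEC_PER_SEC))
    # The sum of two nanosecond parts can reach or exceed one second,
    # so carry the overflow into the seconds field.
    carry, out_nsec = divmod(nsec + frac_nsec, NSEC_PER_SEC)
    return sec + whole + carry, out_nsec
```

Using floor for the whole-second part keeps the fractional part non-negative, so negative elapsed values are handled by the same carry step.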
Mon May 23 14:00:00 2016 Christopher M. Cantalupo v0.1.1 - Fixed race condition in geopm_comm_split_shared(). - Fixed geopmctl so that it works properly (error introduced with policy environment). - Fixed man page links and Makefile target. - Fixed automatic detection of Fortran MPI flags for compile and other build fixes. - Enable application marked with geopm_prof interface to run without controller. - Better consistency checking in global policy. - Enabled profile only use of geopm i.e. no power management (now the default). - Updated STATUS section in README. - Updated TODO list. - Converted plugin developers guide to LaTeX and included it in repository.
Mon May 9 14:00:00 2016 Christopher M. Cantalupo v0.1.0 - First geopm release with code complete runtime component. - Includes a wide range of bug fixes. - Introduced Fortran interface for application APIs. - Introduced globally scoped default profile object for geopm_prof_c interface. - Introduced application tracing capability. - Added NAS Fourier transform benchmark as an example. - Fixes for build system. - Fixes in the documentation. - Remove thread profiling "helper APIs" and replace with geopm_tprof_c interface. - Improvements in shutdown logic. - Shared memory key has default value and can be obtained from environment. - Explicit accounting for time spent in MPI calls through PMPI interface. - Enable nesting of MPI regions within user defined regions. - Remove geopm_prof_sample() interface. - Add some helper APIs for splitting MPI communicators. - Integrate with PMPI profiling interface to MPI. - Merges irregular application feedback with periodic hardware telemetry. - Moves some functionality between classes for better encapsulation. - Region information is no longer communicated between compute nodes. - Implemented plug-in selection through the Policy interface. - Handling of MSR counter overflow. - Implemented a basic decider for the leaf and the tree. - Refactor of Platform/PlatformImp implementation. - Updates to test infrastructure. - Added a synthetic benchmark with static imbalance injection.
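The v0.1.0 entry above lists handling of MSR counter overflow. As an illustration of the general idea only, the sketch below turns a fixed-width hardware counter that wraps around into a monotonically increasing total. The class name `WrappingCounter`, the bit width passed in, and the assumption that the counter wraps at most once between reads are all hypothetical and not taken from the GEOPM source.

```python
class WrappingCounter:
    """Accumulate a fixed-width counter that wraps back to zero.

    Hypothetical sketch: assumes at most one wrap between two
    consecutive calls to update().
    """

    def __init__(self, num_bits):
        self._modulus = 1 << num_bits
        self._mask = self._modulus - 1
        self._last = None
        self._wraps = 0

    def update(self, raw):
        # Discard any bits above the counter width before comparing.
        value = raw & self._mask
        if self._last is not None and value < self._last:
            # The counter went backwards, so it must have wrapped.
            self._wraps += 1
        self._last = value
        return value + self._wraps * self._modulus
```

For example, an 8-bit counter read as 250 and then as 5 yields totals of 250 and 261, since the second read is interpreted as having wrapped once.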
Fri Dec 11 13:00:00 2015 Christopher M. Cantalupo v0.0.3 - Several bug fixes. - Update to user man pages. - Switch to ronn for man page generation (roff + html). - Major update to developer documentation with Doxygen. - Implemented passing of profile data from application to controller. - Implemented output of a summary profile report. - Implemented infrastructure for plug-in extensions. - Templatized CircularBuffer. - Extended tests, including addition of integration tests.
Fri Oct 16 14:00:00 2015 Christopher M. Cantalupo v0.0.2 - Initial release to . - Updates to man pages. - Support for static power modes. - Support for Platform abstraction. - Whitelist generation for MSR driver. - TreeCommunicator implementation to support hierarchy in MPI. - Build and test infrastructure (autotools, gtest, gmock).
Thu Oct 1 14:00:00 2015 Christopher M. Cantalupo v0.0.1 - Initial tag which includes initial draft of man pages only.