Coding Standards for Csound
Michael Gogins
8 May 2012
Summary
This document proposes guidelines for coding in Csound, starting with new code in Csound 6. (Don’t worry, I’m not proposing rewriting all the opcodes!) These guidelines are based on the experience of C and C++ programmers in critical software, high-performance software, and audio software. The main thrust is to code as clearly and readably as possible, to test code thoroughly with both static analysis and dynamic analysis tools, and to run unit tests for every feature. The rules boil down to using only safe parts of the languages, writing the clearest possible code, and handling all possible paths of execution through the code.
Objectives
The goals of coding, in decreasing order of importance, are:
- Clarity. Above all, the code should be as clear as possible. An intermediate programmer should be able to understand the problems the code solves, and how the code solves those problems, just by reading the code and without reading any comments. Every step that the computer follows in executing the code should be obvious. There should not be any hidden or unpredictable jumps. “Step” here refers to source code, not to generated machine code.
- Correctness. The code should be perfectly correct in that every valid input should produce the correct output, every invalid input should produce an informative error message, and it should not be possible to crash or hang the code.
- Speed. The code should run as fast as possible.
- Concision. The code should be as short and simple as possible.
- Self-sufficiency. The code should have the fewest possible external dependencies. If possible, it should use only the facilities of the language itself. Failing that, only system calls. Failing that, only common, reliable external libraries.
- Simple toolchain. The toolchain should be as small as possible, use standard tools, and be as simple to configure as possible.
- Simple installation. It should be possible to install the software simply by copying it. Failing that, by running a standard package configuration tool.
- Self-documenting. The code should contain comments such that using Doxygen or some other code documentation tool will produce a complete and usable reference manual for the code (not a user’s guide).
These goals should not be controversial, but the order of the goals deserves some comment. Clarity comes first, even ahead of correctness and speed, because improving clarity tends to improve all the other objectives, while falling short of the other objectives also tends to obscure clarity.
Rules
The following rules are proposed, in order of decreasing importance, as means of achieving the above goals:
- Coding rules
- Testing rules
- Building rules
- Packaging rules
- Formatting rules
The rules are, however, presented in the order they are considered in actual coding:
- Formatting rules
- Coding rules
- Building rules
- Testing rules
- Packaging rules
The background for these rules is found in C and C++ coding standards for critical systems, in which defects might cause death or catastrophe: automobile embedded systems software, avionics software, space systems software, and medical software. I have consulted the following coding standards:
MISRA-C and MISRA-C++, The Motor Industry Software Reliability Association, coding standards for C and C++ for automobile embedded systems software (http://www.misra.org.uk/). These standards have been hugely influential.
JOINT STRIKE FIGHTER AIR VEHICLE C++ CODING STANDARDS FOR THE SYSTEM DEVELOPMENT AND DEMONSTRATION PROGRAM, Document Number 2RDU00001 Rev C, December 2005 (http://www2.research.att.com/~bs/JSF-AV-rules.pdf). This contains input from Bjarne Stroustrup, the author of the C++ programming language, among others.
C and C++ Coding Standards, European Space Agency, (https://wiki.ucar.edu/download/attachments/25039241/european_space_agency_standards.pdf?version=1&modificationDate=1214000706000).
JPL Institutional Coding Standard for the C Programming Language, (http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf).
The Google C++ Style Guide (http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml).
Optimizing Software in C++ (http://www.agner.org/optimize/optimizing_cpp.pdf). Most of the advice applies just as well to C.
These institutions have established practices for writing defect-free software in C and C++, which are “dangerous” languages with sharp edges. Defects are avoided by using a restricted subset of the language, following rules for extreme clarity, using static analysis tools on code, using dynamic analysis tools on running programs, and performing extensive unit tests.
Practices from large-scale, high-performance computing, especially physics data analysis software, also are relevant. And, of course, practices from real-time software and, especially, other audio and music software are relevant.
Coding standards for critical software have not, so far, extensively considered issues arising from multiple threads. Because Csound is multi-threaded, I have consulted various sources for best practices in concurrent programming. The JPL standard is one coding standard that does consider threading issues.
History shows that large software systems may in fact be written in C or C++ that is effectively free of defects. If this is not happening with your code, you are doing something wrong.
Rules are bulleted. The words must, should, and may are used with intent and are boldface.
Formatting
The main objective is to enforce clarity and to make it as easy as possible to read and analyze the code.
- All code must be written so as to be readable when printed out on letter sized paper or the common user’s computer screen.
- Names must be unambiguous.
- Names must not be abbreviations or acronyms, with the exception of loop indices and array indices.
- Functions must be declared, including return types, on one line of code if possible. If that is not possible, each parameter except the first should come on a separate line.
- Code within function bodies should not include blank lines. The more code there is on the page, other things being equal, the easier it is to read.
- Function bodies should fit on one page of printout or the common user’s computer screen.
- Comments must be complete English sentences that begin with an initial capital letter and end with a period.
- Comments must not be on the same lines as code.
- Indentation must be done with spaces, not tabs, at 4 spaces per indent.
- All code, before being checked into the source repository, must be run through a code formatting tool such as Artistic Style (astyle) using astyle’s format A10, which conforms to Linux style but enforces braces on all blocks of code.
Coding
The main objective is to enforce the use of only a safe and accepted subset of the language, and to avoid all programming practices that are known to risk contradictions, ambiguities, and defects. The rules also are designed to make it easier to use static analysis tools on the code.
The Meta-Rules
- All violations of the “must” coding rules must be documented by comments in the code.
- All source code, build files, and anything else required to build and distribute Csound must be maintained in a documented source code version control system, such as git.
Clarity
- C code must follow ISO/IEC 9899:1999, a.k.a. C99.
- But the following features of C99 must not be used:
- Signals
- setjmp/longjmp
- GNU extensions
- Pragmas (but OpenMP pragmas are allowed).
- Trigraphs
- All implementation defined or unspecified features of the language.
- C++ code must follow ISO/IEC 14882:2011, a.k.a. C++11.
- But the following features of C++11 must not be used:
- Exceptions
- Run-time type information
- GNU extensions
- Pragmas (but OpenMP pragmas are allowed).
- All implementation defined or unspecified features of the language.
- Types must be declared in only one place in code.
- Types must be defined in only one place in code.
- Types must not be encoded into names. By accident, such types may be encoded differently than they are actually declared. This rule prevents such contradictions.
- Physical or musical units must not be encoded into names. By accident, such units may come to differ from the units actually declared as part of the type in code. This rule prevents contradictions between units in names and units in types.
- External types must never be redeclared or redefined.
- Macros must never be undefined.
- The extern keyword must only be used to declare the binary linkage of code. For example, “extern int foo();” is forbidden but “extern "C" int foo();” is permitted.
- Comments must be used but only to explain the purpose of each public type or function; such comments must be in Doxygen format and be clear enough to serve as the text of the manual reference for that code. This rule prevents public functions and types from going undocumented.
- Comments must not be used to explain the purpose of code. By accident, comments that explain the logic of code may come to differ from the actual logic of the code. This rule prevents such contradictions.
- #include <> must be used.
- #include "" must not be used. The reason is that different compilers implement it differently.
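As an illustration of several of the clarity rules above, here is a minimal sketch in C99. The type names Seconds, FramesPerSecond, and Frames, and the function frames_from_duration, are hypothetical and are not part of Csound. The sketch assumes that one reasonable way to keep units out of names is to wrap each quantity in its own small struct, so that the unit is declared once, in the type. The only comments are Doxygen comments on the public types and function.

```c
#include <assert.h>
#include <stdint.h>

/**
 * A duration in seconds. The unit is part of the type, so a name such as
 * "duration_in_seconds" is unnecessary. Hypothetical, not part of Csound.
 */
typedef struct {
    double value;
} Seconds;

/**
 * A sampling rate in audio frames per second. Hypothetical.
 */
typedef struct {
    int32_t value;
} FramesPerSecond;

/**
 * A count of audio frames. Hypothetical.
 */
typedef struct {
    int64_t value;
} Frames;

/**
 * Returns the number of whole audio frames contained in a duration at a
 * given sampling rate. The duration must not be negative and the rate
 * must be positive; violations are caught by runtime assertions.
 */
Frames frames_from_duration(Seconds duration, FramesPerSecond rate)
{
    assert(duration.value >= 0.0);
    assert(rate.value > 0);
    const Frames frames = { (int64_t)(duration.value * (double)rate.value) };
    return frames;
}
```

Because Seconds, FramesPerSecond, and Frames are distinct struct types, passing one where another is expected is a compile-time error rather than a silent unit mix-up.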
Concision and Minimal Scope
- Scope should be as limited as possible. E.g. all functions and data structures should be declared static or at file scope unless they are part of a public API.
- Macros must be used to prevent multiple inclusion of header files. Such macros must be named CSOUND_DIR_FILE_EXTENSION, e.g. CSOUND_INCLUDE_CSOUND_H.
- “C” header files must define nothing.
- “C” header files must declare as little as possible.
- “C” header files must #include as few other header files as possible.
- C++ header files should declare as little as possible.
- C++ header files should #include as few other header files as possible. Forward declarations should be used whenever possible.
- C++ header files may define static objects for access through singleton functions.
- Public C++ header files must define nothing.
- Public C++ header files must declare only types, functions including factory functions, and pure abstract classes.
- Implementation C++ header files should define as many function bodies within class declarations as possible. The reasons for this are both readability and higher performance.
- Names in an outer scope must not be redefined in an inner scope.
- All variables must be initialized when they are first defined. This includes setting all memory contained in C structs to zeros before use.
- In C++, all variables must be declared and defined just before they are used.
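The scope rules above might look like the following in practice. This is only a sketch: the Envelope type, the file names include/envelope.h and src/envelope.c, and both functions are hypothetical, and the header and source file are shown together in one listing for brevity. The guard macro follows the CSOUND_DIR_FILE_EXTENSION pattern, the header declares but defines nothing, the helper is static, and the struct is zeroed before use.

```c
/* Sketch of a hypothetical public header, include/envelope.h. */
#ifndef CSOUND_INCLUDE_ENVELOPE_H
#define CSOUND_INCLUDE_ENVELOPE_H
#include <stdint.h>

/**
 * State of a hypothetical linear envelope generator.
 */
typedef struct {
    double current_level;
    double increment_per_frame;
    int32_t frames_remaining;
} Envelope;

/**
 * Zeroes the envelope, then sets it up to ramp from zero to target_level
 * over frame_count frames. Returns 0 on success, or -1 if envelope is
 * null or frame_count is not positive.
 */
int32_t envelope_start(Envelope *envelope, double target_level, int32_t frame_count);
#endif

/* Sketch of the matching hypothetical source file, src/envelope.c. */
#include <assert.h>
#include <string.h>

/**
 * File-scope helper. It is static because it is not part of the public API.
 */
static double increment_for(double target_level, int32_t frame_count)
{
    assert(frame_count > 0);
    const double increment = target_level / (double)frame_count;
    return increment;
}

int32_t envelope_start(Envelope *envelope, double target_level, int32_t frame_count)
{
    int32_t result = -1;
    if ((envelope != NULL) && (frame_count > 0)) {
        (void)memset(envelope, 0, sizeof(*envelope));
        envelope->increment_per_frame = increment_for(target_level, frame_count);
        envelope->frames_remaining = frame_count;
        result = 0;
    }
    return result;
}
```

The call to memset assumes, as is true on IEEE 754 platforms, that all-bits-zero represents 0.0 for a double.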
Correctness
- Strong static typing should be used.
- Physical and musical units should if possible be declared as part of the type.
- Elementary data types must be declared with sized types, e.g. using stdint.h, e.g. int32_t not int.
- Macros must not be used except for preventing multiple inclusion of header files, or for conditional compilation. E.g. enums must be used to declare constant numerical values. The reason for this rule is that macros obscure the actual path of the code.
- In a list of enums, the first one and only the first one must be defined with =. By accident, the same value may come to be assigned to several different enums. This rule prevents such contradictions.
- varargs must not be used with the exception of Csound’s message printing functions. By accident, the user of a varargs function may come to pass parameters in different order than the function expects. This rule prevents such contradictions.
- Functions must not return values unless those values are used.
- Functions must have only one point of exit.
- Recursive functions must not be used. By accident, recursion may not end and may cause the stack to exhaust memory. This rule prevents such errors.
- Global static data must not be used.
- All functions must be re-entrant.
- Functions must not have side effects in memory, aside from possibly changing the value of reference parameters.
- Pointer arithmetic must not be used. Array arithmetic is less obscure, and at least as fast.
- Aliases must not be used. Aliases may create the impression of there being two objects, when in fact there is only one.
- All loops must have a pre-determined number of iterations, with the sole exception of the event loop in an event-driven program.
- Everything that can be declared const, must be declared const.
- All function return values must either be checked for errors or cast to (void). The purpose is to make it crystal clear when the return value matters, and to handle it whenever it does matter.
- All function parameters must be checked for correctness before the function body executes. The parameters should be checked with compile-time assertions and if that is not possible, they must be checked with runtime assertions or cause an error return.
- The <stdio.h>, <cstdio>, and <iostream> headers must not be used except for formatting strings and messages. These headers are not as efficient as system calls and are more subject to errors.
- The heap must not be used while Csound is performing (unless, for some specific purpose, that is not possible).
- Instead, Csound must preallocate a large enough block of memory for the performance and use that as an internal heap without deallocation until the beginning of the next performance.
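The preallocated internal heap described in the last two rules could be sketched roughly as follows. This is not Csound's actual allocator: the Arena type, the ArenaStatus codes, the 16-byte ARENA_ALIGNMENT, and all four functions are assumptions made for illustration. The sketch also tries to follow the other correctness rules: sized types, enums instead of macros for constants, only the first enumerator assigned with =, one point of exit per function, checked parameters, array indexing instead of pointer arithmetic, and ignored return values cast to (void).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { ARENA_ALIGNMENT = 16 };

/**
 * Hypothetical status codes. Only the first enumerator is assigned with =.
 */
typedef enum {
    ARENA_OK = 0,
    ARENA_INVALID_ARGUMENT,
    ARENA_OUT_OF_MEMORY
} ArenaStatus;

/**
 * Hypothetical memory pool that is allocated once, before performance.
 */
typedef struct {
    uint8_t *base;
    size_t capacity;
    size_t used;
} Arena;

/**
 * Allocates the whole zero-filled pool from the system heap.
 */
ArenaStatus arena_create(Arena *arena, size_t capacity)
{
    ArenaStatus status = ARENA_INVALID_ARGUMENT;
    if ((arena != NULL) && (capacity > 0)) {
        (void)memset(arena, 0, sizeof(*arena));
        arena->base = calloc(capacity, sizeof(uint8_t));
        status = ARENA_OUT_OF_MEMORY;
        if (arena->base != NULL) {
            arena->capacity = capacity;
            status = ARENA_OK;
        }
    }
    return status;
}

/**
 * Hands out a zero-filled block from the pool without calling the system
 * heap. Block sizes are rounded up to a multiple of ARENA_ALIGNMENT.
 * Blocks are never freed individually.
 */
ArenaStatus arena_allocate(Arena *arena, size_t size, void **block)
{
    ArenaStatus status = ARENA_INVALID_ARGUMENT;
    if ((arena != NULL) && (arena->base != NULL) && (block != NULL) && (size > 0)) {
        const size_t alignment = (size_t)ARENA_ALIGNMENT;
        const size_t remaining = arena->capacity - arena->used;
        *block = NULL;
        status = ARENA_OUT_OF_MEMORY;
        if (size <= remaining) {
            const size_t padding = (alignment - (size % alignment)) % alignment;
            if (padding <= (remaining - size)) {
                *block = &arena->base[arena->used];
                arena->used = arena->used + size + padding;
                status = ARENA_OK;
            }
        }
    }
    return status;
}

/**
 * Releases every block at once, for use at the start of the next performance.
 */
void arena_reset(Arena *arena)
{
    assert(arena != NULL);
    if (arena->base != NULL) {
        (void)memset(arena->base, 0, arena->used);
    }
    arena->used = 0;
}

/**
 * Returns the pool to the system heap when Csound shuts down.
 */
void arena_destroy(Arena *arena)
{
    assert(arena != NULL);
    free(arena->base);
    (void)memset(arena, 0, sizeof(*arena));
}
```

A caller would check every returned status, for example treating anything other than ARENA_OK from arena_allocate as an error to report before performance continues.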
Speed
As it happens, the rules for correctness, especially regarding minimal scope and the heap, also help with speed. The following rules go further.
- Regarding speed, guidelines are not enough.
- To identify sections of code that might benefit from speedups, you must use a profiler such as valgrind callgrind.
- For critical loops, you should perform controlled experiments with repeated, high-precision timing measurements.
- Standard C for loops and C++ standard library iterator-controlled loops should be used, as the generated code is highly optimized and can usually be vectorized automatically.
- Elementary data types for computations should be declared as sized types optimal for calculations, e.g. int_fast32_t.
- C++ types should be initialized by constructors instead of by assignment, e.g. std::string mystring("hi") not std::string mystring = "hi".
- As much code as possible should be defined as inline. This can easily be done in C++ by defining functions in header files.
- In C and C++, arrays and std::vector should be used instead of linked lists, even for queues and fifos and such. This will foster both cache coherence and faster iteration.
- Static objects, when their use is impossible to avoid, should be defined in the source files that will use them. This will foster cache coherence.
- Compiler intrinsics should be used.
- Whole program optimization should be used.
- Since exceptions must not be used, the stack pointer must not be used.
- Functions that call each other should be near each other in the source code. This will make caching more efficient.
- In C++, the standard library containers (STL) should use Csound’s allocator, not the heap.
- Csound’s internal heap and any other memory pools must be aligned on 32 bit boundaries. This will help with automatic vectorization.
- As much of Csound as possible should be linked statically.
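To make the rule about preferring arrays to linked lists concrete, here is a minimal sketch of a queue stored in a fixed-size array. The Fifo type, its capacity of 8, and the three functions are hypothetical. Because the items are contiguous, pushing and popping need no allocation and no pointer chasing. The head and count fields use int_fast32_t, as suggested by the rule on sized types for computation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { FIFO_CAPACITY = 8 };

/**
 * Hypothetical fixed-capacity queue of event identifiers, stored
 * contiguously in an array rather than in linked nodes.
 */
typedef struct {
    int32_t items[FIFO_CAPACITY];
    int_fast32_t head;
    int_fast32_t count;
} Fifo;

/**
 * Empties the queue and sets all of its memory to zeros.
 */
void fifo_clear(Fifo *fifo)
{
    assert(fifo != NULL);
    (void)memset(fifo, 0, sizeof(*fifo));
}

/**
 * Appends an item at the tail. Returns false if the queue is full.
 */
bool fifo_push(Fifo *fifo, int32_t item)
{
    assert(fifo != NULL);
    bool pushed = false;
    if (fifo->count < FIFO_CAPACITY) {
        const int_fast32_t tail = (fifo->head + fifo->count) % FIFO_CAPACITY;
        fifo->items[tail] = item;
        fifo->count = fifo->count + 1;
        pushed = true;
    }
    return pushed;
}

/**
 * Removes the oldest item into *item. Returns false if the queue is empty.
 */
bool fifo_pop(Fifo *fifo, int32_t *item)
{
    assert(fifo != NULL);
    assert(item != NULL);
    bool popped = false;
    if (fifo->count > 0) {
        *item = fifo->items[fifo->head];
        fifo->head = (fifo->head + 1) % FIFO_CAPACITY;
        fifo->count = fifo->count - 1;
        popped = true;
    }
    return popped;
}
```

Whether this actually beats a linked list for a given workload should still be settled by profiling, as the first rules in this section require.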
Concurrency
- Tasks that can be sped up by running in parallel should be enabled to run in parallel at the highest possible level of granularity.
- Selection of such tasks should be done by experiment.
- Shared data must be owned by one single thread.
- Parallel tasks and for loops should be implemented using OpenMP pragmas.
- Background threads and event-handling threads should be implemented using pthreads.
- Threads and thread pools for producer-consumer systems connected by queues may be implemented with either OpenMP or pthreads.
- Writing to shared data should be performed via inter-process communication mechanisms, such as lock-free queues.
- However, the reduce step of OpenMP parallel regions must be implemented using OpenMP pragmas.
- If nested monitors are used to guard shared data, locking and unlocking must always be performed in the same order.
External Libraries
- Csound must use third party, open source libraries when they provide superior functionality or a significant savings in development time.
- Csound should also minimize the use of such libraries. Only the following libraries are permitted:
- libsndfile
- PortAudio
- PortMidi
- PortSMF
- Jack
- VST SDK
- eigen
- STK
- ATSA
- Qt
- DSSI
Building
- The source code tree should be as simple as possible:
- src for all Csound engine source code and all non-public header files.
- include for all public Csound header files.
- src/modules for all dynamically loaded Csound modules (plugins, drivers, etc.). Header files for these modules should not be used.
- doc for all Csound documentation.
- examples for all Csound example compositions.
- The GNU toolchain (gcc, g++, ar, ld, gdb, etc.) should be the primary toolchain.
- However, autotools must not be used.
- CMake must be used as the overall build tool.
- One CMake project must be used to build all of Csound, run all tests, generate documentation, and package Csound for distribution.
- All configuration decisions must be documented and controlled in one and only one top-level config.h.
- SWIG may be used to generate language bindings.
- The main body of reference documentation must be generated using Doxygen. By accident, the comments in code and the reference documentation may come to differ. This rule prevents such contradictions.
- The compiler warning level must be set as high as possible.
- The code must compile without any errors or warnings.
- All builds must be -O3 -g. There must not be separate builds for debugging and release. By accident, the generated code may follow a different path in a debug build than it does in a release build (this has happened to me many times). This rule prevents such contradictions.
Testing
Before checking in, all code must be subjected to the following tests:
- All code must pass static analysis with Cppcheck or Clang.
- All false positives in static analysis must be resolved by rewriting the code.
- Unit tests must be checked in along with source code and in the same directory.
- All code must pass valgrind memcheck, while running unit tests and while running a large actual composition.
- All code must pass valgrind helgrind for threading issues and data races, while running unit tests and while running a large actual composition.
- All code must pass valgrind callgrind for performance profiling and cache usage, while running unit tests and while running a large actual composition.
- All code must have and must pass unit tests. Such tests may, however, be written at a fairly high level of usage (e.g. a test piece of music).
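A unit test of the kind required above does not need a framework. The following is a minimal sketch in which clip_sample stands in for a function under test; that function, the check helper, and run_clip_sample_tests are all hypothetical. The test function returns the number of failed cases, so a test driver built as a target can exit with a nonzero status when anything fails, and the same executable can be run under valgrind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/**
 * Hypothetical function under test. Limits a sample to the range -1 to 1.
 */
double clip_sample(double sample)
{
    double clipped = sample;
    if (sample > 1.0) {
        clipped = 1.0;
    } else if (sample < -1.0) {
        clipped = -1.0;
    }
    return clipped;
}

/**
 * Returns 0 if the condition holds. Otherwise prints the description of
 * the failed case to standard error and returns 1.
 */
static int32_t check(bool condition, const char *description)
{
    assert(description != NULL);
    int32_t failures = 0;
    if (!condition) {
        (void)fprintf(stderr, "FAILED: %s\n", description);
        failures = 1;
    }
    return failures;
}

/**
 * Runs every test case for clip_sample and returns the number of failures.
 */
int32_t run_clip_sample_tests(void)
{
    int32_t failures = 0;
    failures += check(clip_sample(0.25) == 0.25, "A sample inside the range is unchanged.");
    failures += check(clip_sample(1.0) == 1.0, "The upper limit is unchanged.");
    failures += check(clip_sample(-1.0) == -1.0, "The lower limit is unchanged.");
    failures += check(clip_sample(1.5) == 1.0, "A sample above the range is clipped.");
    failures += check(clip_sample(-1.5) == -1.0, "A sample below the range is clipped.");
    return failures;
}
```

A test driver would call run_clip_sample_tests from its main function and return a nonzero exit status if the result is not zero.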
Packaging
- All installers and delivery packages must be build targets.
- All unit tests must be build targets.
- The build system must build all targets, including documentation and examples, in a local prefix tree.
- It must be possible to install Csound by unzipping an archive of the prefix tree in the user’s home directory, and simply running it.
- Installers and packages should provide users a choice of more or less limited feature sets.