Upreckon

Welcome to the home page of Upreckon, a smart and easy-to-use automated program tester. Upreckon was designed specifically for testing solutions to olympiad problems, such as those of the International Olympiad in Informatics, but it can be used to automatically test anything.

Creative Commons Licence

Warning: these licensing terms for Upreckon 2 are preliminary and may change at any moment. (They do apply to test.py 1.x though.) Upreckon was and is being developed by Oleg Oshmyan, also known as Chortos‑2. The source code of Upreckon is open, but Upreckon is neither open-source as defined by the OSI nor free software: in particular, it is forbidden to make changes to it and redistribute modified versions, even if the original author is credited, unless he gives written permission to do so. (The author is fed up with the countless forks in the free software world, which make it impossible to choose among them, and wants to avoid this with Upreckon. If you think you can improve Upreckon, please by all means send a patch to the author, and do ask for contribution rights if you think you can contribute over a longer period of time.) You are allowed to make changes in your local copy of Upreckon as long as you do not redistribute it. Except for this relaxation, Upreckon is technically licensed under the Creative Commons Attribution-NoDerivs 3.0 Unported Licence; if you are interested, do check the non-Legalese summary of the licence at the CC website. Your suggestions for a more appropriate licence are welcome.

Grab it! After extracting the tarball, run python setup.py install. The file to run will be installed as upreckon on POSIX systems and as <your Python installation directory>\Scripts\upreckon.cmd on Windows (the latter can be run from the command line without the .cmd extension).

If your Python is older than 3.3 and you want Upreckon to read bzip2-compressed files in ZIP archives, you need a modified zipfile.py: pick the one corresponding to your version of Python and save it as zipfile.py in the same directory where test.py or upreckon resides. As an exception, Upreckon 2.00 (but not 2.01+) and pre-built Windows binaries of later versions of Upreckon include the modified zipfile.py, so you already have bzip2-in-ZIP support if you have these versions.

The latest release version of Upreckon is 2.04.1.

Alternatively, you can use the latest version from the Mercurial repository by running (cd upreckon && ./publish.sh) in a UNIX shell before python setup.py install.

Please do not link directly to the files provided on this page. While it is not forbidden to do so, in most cases links to this page are much more useful than direct links to the files. Besides, there are plenty of events that may happen and force existing direct links to stop working or to lead to incomplete or non-working versions of the tools they were supposed to lead to. For example, direct links to test.py 1.x have stopped working since Upreckon 2 was released.

In case you want to link directly to a section of this page, a fragment identifier has been assigned to each section. You can look up the identifiers in the source of the page. (In most cases, the source can be viewed by right-clicking on an empty place on the page and choosing View Source. When you get there, search for id=.)

The documentation on this page is incomplete. The information that is present should be accurate, although this is not guaranteed any more now that the page focuses on Upreckon 2.

If you use some API and want it frozen for the sake of backwards/forwards compatibility, do tell me! Otherwise I just assume no-one except myself uses any and change them in incompatible ways as development goes on.

Naming and Versioning

Before Upreckon 2.00, Upreckon was known as test.py; the last generally accessible version of test.py had the version number 1.20.3. You might wonder why I was using such a large minor version number while keeping the major version number so low. This is because I believe the major version number should be incremented on major changes, such as architecture re-designs and programming language switches. For example, Upreckon 2.00 was an almost complete re-write and had a design totally different from that of test.py 1.x.

Development of Upreckon is done in Mercurial (hg). If you stumble upon a computer with an hg snapshot installed as the main copy of Upreckon, it will report its version as the next expected version number with the changeset hash in parentheses, like this: 2.00.0 (hg 2b459f9743b4). Public releases do not report changeset hashes.

System Requirements

Usage

This list is (obviously) not yet complete, so please also take a look at the output of upreckon --help.

upreckon
Runs a full test of all programs against all test cases as specified in the configuration file.
upreckon -h
upreckon --help
Displays a built-in message on the usage of Upreckon.
upreckon --version
Displays the version number of the installed Upreckon.

Configuration

The configuration file is named testconf.py and must be a valid Python source code file. In all configuration files, line endings may be any of CR, LF and CRLF, and a single file may use multiple line ending styles.
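
Because the configuration file is ordinary Python, it can be inspected with a few lines of code. The following is a minimal sketch and not Upreckon's actual loader: the function name load_testconf is hypothetical, UTF-8 is an assumed encoding, and reading the file in text mode with universal newlines is merely one way to accept CR, LF and CRLF within the same file.

```python
def load_testconf(path="testconf.py"):
    """Execute a testconf.py-style file and return the names it defines.

    Hypothetical helper for illustration; not part of Upreckon.
    """
    # Text mode with the default newline=None translates CR, LF and CRLF
    # to '\n', so mixed line endings within one file are accepted.
    # The UTF-8 encoding is an assumption made for this sketch.
    with open(path, "r", encoding="utf-8") as config_file:
        source = config_file.read()
    namespace = {}
    exec(compile(source, path, "exec"), namespace)
    # Drop the entries that exec() adds on its own, such as __builtins__.
    return {name: value for name, value in namespace.items()
            if not name.startswith("__")}
```

Only run this on configuration files you trust, since their contents are executed as code.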

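There are two possible configuration types: single-problem and multi-problem. A single-problem configuration describes exactly one problem. A multi-problem configuration sets the problems variable to the names of several problems, each of which may have its own problem-specific configuration file.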
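
As an illustration, a global configuration file for a multi-problem setup might look like the sketch below. The problem names 'apples' and 'bridges' and all the limit values are made up for this example; only the variable names come from this page.

```python
# Global testconf.py of a hypothetical two-problem contest.
# The problem names 'apples' and 'bridges' are invented for this example.
problems = ('apples', 'bridges')

# Variables set in the global file serve as defaults for the
# problem-specific configuration files.
maxcputime = 2   # seconds of CPU time per test case
maxmemory = 64   # mebibytes per test case
stdio = True     # the assessed programs use standard input and output
```

With no testee given, each problem's program would default to './' followed by the problem name, for example './apples'.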

Configuration Variables
| Name | Default value | Type | Optionality condition | Meaning |
|---|---|---|---|---|
| problems | None | iterable of strings | | If None, the configuration is single-problem. Otherwise, the configuration is multi-problem and this gives the names of the problems it contains. |
| testee | in multi-problem configurations: './' + the current problem name | string or iterable of strings | | The path or argument list to the executable file of the program to assess. (See the subprocess documentation for type-dependent meaning.) The first/only item must be in POSIX path format and will be converted automatically if needed. |
| maxcputime | 0 | number | | The CPU time limit per test case in seconds. Zero means there is no limit. |
| maxwalltime | 0 | number | | The wall-clock time limit per test case in seconds. Zero means there is no limit. |
| maxmemory | 0 | number | | The memory limit (exact meaning varies from platform to platform) per test case in mebibytes. Zero means there is no limit. |
| usegroups | False | any | | The truth value specifies whether test groups are used. |
| tests | | iterable | | The identifiers of the test cases to test against. If usegroups is true, this must be an iterable of iterables. |
| dummies | () | iterable | | The identifiers of the sample test cases to test against. |
| padtests | 0 | integer | | All test case identifiers shorter than this number of characters will be left-padded with leading zeroes to this length (if an identifier starts with + or -, zeroes will be inserted after this character instead). |
| paddummies | 0 | integer | | Same as padtests but for sample test cases. |
| taskweight | 100 | number | | The weighted number of points awarded for a perfect solution of the problem. |
| pointmap | {} | mapping to numbers | | The number of points awarded for a correct solution of every test case. The object must map test case identifiers or iterables of test case identifiers to points; in the latter case, the same number of points is given for every identifier in the iterable (string keys are interpreted as sequences of characters). For a solution of any test case not found in this object, the value that None is mapped to will be awarded. If None is not mapped to anything, 1 point will be awarded. |
| stdio | False | any | | If true, means that the assessed program uses standard input and output. If false, means that it uses file input and output. |
| inname | with the -s option: '%.in' | string | optional if stdio is true | The pattern of the input file name used by the assessed program. If an output validator is used and this value is set and true, the input is copied under this name before launching the validator. |
| outname | with the -s option: '%.out' | string | optional if stdio is true and tester is false | The pattern of the output file name used by the assessed program. If an output validator is used, the output is copied under this name before launching the validator. |
| ansname | with the -s option: '%.ans' | string | optional if tester is false | The pattern of the correct output file name used by the output validator. If an output validator is used and this value is true, the correct output is copied under this name before launching the validator. |
| testcaseinname | | string | | The pattern of the test case input file names. |
| testcaseoutname | | string | optional if tester is true and ansname is false | The pattern of the test case correct output file names. |
| dummyinname | '' | string | | The pattern of the sample test case input file names. If false, it is replaced by the value of testcaseinname. |
| dummyoutname | '' | string | optional if tester is true and ansname is false | The pattern of the sample test case correct output file names. If it is false but testcaseoutname is not optional, it is replaced by the value of testcaseoutname. |
| tester | '' | string, iterable or false value | | If true, specifies the command line launching the output validator. If false, means that no output validator is used. The first/only item must be in POSIX path format and will be converted automatically if needed. |
| pause | on POSIX platforms: 'read -s -n 1'; on Windows: 'pause' | string | optional with the -x option | The command line launching anything that potentially outputs something to the standard output and waits for the user to press any key. It is executed in the shell at the end of the entire testing process. |
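
The lookup rules of pointmap are easier to follow in code. The sketch below is one possible reading of them, not Upreckon's own implementation: the function name points_for is hypothetical, and comparing identifiers by their string form is an assumption made so that integer and string identifiers can be mixed.

```python
from collections.abc import Iterable


def points_for(test_id, pointmap):
    """Return the points for one test case under the documented pointmap rules.

    Hypothetical helper; identifiers are compared by their string form,
    which is an assumption of this sketch.
    """
    wanted = str(test_id)
    for key, points in pointmap.items():
        if key is None:
            continue  # None is only the fallback entry, handled below
        # An iterable key (including a string, read character by character)
        # gives the same points to every identifier it contains.
        members = key if isinstance(key, Iterable) else (key,)
        if any(str(member) == wanted for member in members):
            return points
    # Identifiers not listed get the value mapped to None, or 1 point.
    return pointmap.get(None, 1)


example_map = {1: 5, (2, 3): 10, '45': 20, None: 2}
```

Here points_for(3, example_map) gives 10, points_for(4, example_map) gives 20 because the string key '45' stands for the identifiers 4 and 5, and points_for(45, example_map) falls back to the None entry and gives 2.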

File Name Patterns

Several configuration variables represent file name patterns. The following characters have special meaning inside such patterns:

Sample Configuration File

maxcputime = 1
maxwalltime = 60
tests = range(1, 12)
padtests = 2
taskweight = 100
pointmap = {}
testee = './0620-maxdist'
stdio = True
inname = 'input.txt'
outname = 'output.txt'
testcaseinname = 'input.$'
testcaseoutname = 'answer.$'
tester = 'python', 'checker.py'
ansname = 'answer.txt'
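
To see which test case files a configuration like this one refers to, the padding and pattern rules can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, '%' is replaced with the problem name as described in the change log of 1.20.0, and the treatment of '$' as a placeholder for the padded test case identifier is inferred from the sample above rather than taken from a specification.

```python
import re


def pad_identifier(identifier, width):
    """Left-pad an identifier with zeroes, keeping a leading + or - in front."""
    # str.zfill inserts the zeroes after a leading sign, as padtests requires.
    return str(identifier).zfill(width)


def expand_pattern(pattern, identifier, problem_name):
    """Expand a file name pattern (assumed rules: '$' = identifier, '%' = problem)."""
    substitutions = {'$': identifier, '%': problem_name}
    return re.sub(r'[$%]', lambda match: substitutions[match.group()], pattern)


tests = range(1, 12)
padtests = 2
# In single-problem configurations the problem name is a single full stop.
input_names = [expand_pattern('input.$', pad_identifier(test, padtests), '.')
               for test in tests]
```

Under these assumptions, input_names runs from 'input.01' to 'input.11'.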

Test Cases and Validation

Every test case is composed of an input file and either an output file or an output validation program. For a single problem, either all test cases have output files or all test cases have a single (common) output validation program.

In all test case output files, line endings may be any of CR, LF and CRLF, and a single file may use multiple line ending styles. In test case input files, all line endings should be LF for maximum portability, as Upreckon performs no line ending conversion on them.

By default, the assessed program’s output is compared line-by-line to the reference output, ignoring line endings but not ignoring any other whitespace. If any difference is found, including the absence of a trailing line break in the program’s output when there is one in the reference output, a wrong answer is scored. If the number of lines is the same and every line compares equal, a correct answer is scored.
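
The default comparison can be approximated in a few lines. This is a sketch of the rule as described here, not the built-in validator itself; the function names are hypothetical. Normalizing every line ending and then comparing the results is equivalent to comparing line by line while ignoring the line ending style.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def outputs_match(program_output: bytes, reference_output: bytes) -> bool:
    """Compare two outputs, ignoring only the style of the line endings."""
    return normalize_newlines(program_output) == normalize_newlines(reference_output)
```

For example, b"1\r\n2\r\n" matches b"1\n2\n", whereas b"1\n2" (no trailing line break) and b"1 \n2\n" (extra space) do not.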

In many problems, there is no single correct output, so this approach is not suitable. For every such problem, a separate program or script has to be written that checks whether the assessed program’s output is correct. Such programs and scripts are called output validators in this documentation. An output validator usually reads the original input data and the assessed program’s output from files, and often also the reference output. The verdict is reported through the validator’s exit code: if it is zero, a correct answer is scored, otherwise the answer is deemed wrong. The output validator may print additional information for the user to its own standard output, which will be shown by Upreckon as a note in parentheses after the ‘OK’ or ‘wrong answer’ verdict. (It is preferred that such notes thus start with a lowercase letter and do not end in a full stop.)
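
A minimal output validator might look like the sketch below. It is an illustration only: the token-based comparison is an arbitrary choice of checking rule, the file names output.txt and answer.txt are borrowed from the sample configuration file, and it is assumed that the validator is started in the directory into which these files have been copied.

```python
def check(output_text, answer_text):
    """Return (is_correct, note), comparing whitespace-separated tokens."""
    produced, expected = output_text.split(), answer_text.split()
    if len(produced) != len(expected):
        return False, "expected %d tokens, found %d" % (len(expected), len(produced))
    for position, (got, want) in enumerate(zip(produced, expected), start=1):
        if got != want:
            return False, "token %d differs" % position
    return True, "%d tokens match" % len(expected)


def main(outname="output.txt", ansname="answer.txt"):
    """Read both files, print a note and return the validator's exit code."""
    with open(outname) as output_file:
        output_text = output_file.read()
    with open(ansname) as answer_file:
        answer_text = answer_file.read()
    correct, note = check(output_text, answer_text)
    print(note)  # shown in parentheses after the verdict
    return 0 if correct else 1  # zero means a correct answer


# A stand-alone checker script would end with: import sys; sys.exit(main())
```

The notes start with a lowercase letter and have no trailing full stop, in line with the preference stated above.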

Test and Configuration Data Location

In the following lists, a special notation is used. When you see ‘archive:’, it means that the path to the right of the colon is a path inside an archive. Look below for more information about archives. taskname denotes the name of the assessed problem; for single-problem configurations, it is equal to a single full stop. Note that in file systems, some/path/./ is the same as some/path/ (in other words, the special file . is the directory in which it is located).

For single-problem configurations, the configuration file must be located in the tests directory, in the root of an archive or in the current working directory. The possible locations are considered in the order they are listed.

For multi-problem configurations, the global configuration file must follow the rules above, while problem-specific configuration files are searched for in more places. The places and their search order are as follows:

  1. taskname/testconf.py
  2. taskname/tests/testconf.py
  3. taskname/archive:testconf.py
  4. taskname/archive:tests/testconf.py
  5. tests/taskname/testconf.py
  6. tests/testconf.py
  7. archive:taskname/testconf.py
  8. archive:taskname/tests/testconf.py
  9. archive:tests/testconf.py
  10. archive:testconf.py
  11. testconf.py

Test case input and correct output data is searched for in the following places in the following order:

  1. taskname/tests/filename
  2. taskname/archive:filename
  3. taskname/archive:tests/filename
  4. tests/taskname/filename
  5. tests/filename
  6. archive:taskname/filename
  7. archive:taskname/tests/filename
  8. archive:tests/filename
  9. archive:filename

Archive Support

The following archive types are supported:

When searching for a file in an archive, all the following archives are considered in the listed order. The search stops as soon as one is found that exists and contains the requested file.

  1. tests.tar
  2. tests.zip
  3. tests.tgz
  4. tests.tar.gz
  5. tests.tbz2
  6. tests.tar.bz2
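
The archive search can be sketched with the standard tarfile and zipfile modules. This is not Upreckon's code: the function name and return value are hypothetical, and the sketch only demonstrates the stop-at-first-match behaviour described above.

```python
import os
import tarfile
import zipfile

ARCHIVE_NAMES = ("tests.tar", "tests.zip", "tests.tgz",
                 "tests.tar.gz", "tests.tbz2", "tests.tar.bz2")


def read_from_archives(directory, member):
    """Return (archive_name, data) for the first archive holding member, or None."""
    for name in ARCHIVE_NAMES:
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # this archive does not exist; try the next one
        if name.endswith(".zip"):
            with zipfile.ZipFile(path) as archive:
                if member in archive.namelist():
                    return name, archive.read(member)
        else:
            # tarfile.open detects gzip and bzip2 compression automatically.
            with tarfile.open(path) as archive:
                try:
                    info = archive.getmember(member)
                except KeyError:
                    continue  # the archive exists but lacks the file
                extracted = archive.extractfile(info)
                if extracted is not None:
                    return name, extracted.read()
    return None
```

If both tests.zip and tests.tar.gz contain the requested file, the copy in tests.zip wins because it comes earlier in the list.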

Change Log and Release Notes

The change log and release notes are given here in order of descending version number, so the newest version is at the top.

2.04.1
Platform-independent bug fixes:
  • It is no longer possible to get duplicate test identifiers when match is 're'. (r246)
2.04.0
Platform-independent API changes:
  • Callable output validators may now give comments not only as strings but also as bytes. (r232–233, r240)
Platform-independent bug fixes:
  • When dealing with external output validators, bytes are now passed through without being decoded, avoiding issues related to character encodings. (r232–233, r240)
UNIX-specific feature changes:
  • When _unix is in use, pressing Escape now immediately cancels the current test even if Upreckon is still only unarchiving its input data. (r218)
2.03.2
Bug fixes:
  • On Python 3, testconf being read from a natively unsupported archive no longer causes a crash. (r228)
  • On Windows, another workaround for AssignProcessToJobObject failing with ERROR_ACCESS_DENIED has been added, as well as a proactive fix for that happening when Upreckon is itself assigned to a job on Windows 7 and older. (r226)
2.03.1
Platform-independent bug fixes:
  • Positional arguments on the command line are now handled correctly when match is 're'. (r222)
2.03.0
Platform-independent feature changes:
  • The match configuration variable has been added. If set to 're', the tests and dummies configuration variables are treated as regular expressions. If usegroups is true, tests should be a pair consisting of the regular expression and the regular expression group number by which to group test case identifiers into test groups. Regular expression group numbers in the regular expressions start at two. (r193, r196, r209)
  • The okexitcodemask configuration variable has been added. If non-zero, it denotes a bitmask applied to the exit code of the external output validator: the matching bits denote whether the verdict should be ‘OK’ (if non-zero) or ‘partly correct’/‘wrong answer’ (if zero); the remaining bits are processed as before. (r205)
  • The taskweight configuration variable can now be an iterable or a mapping. If an iterable, its values correspond to the values of the problems configuration variable; if a mapping, its keys are assumed to be problem names. (r201, r203, r207)
  • The force_zero_exitcode configuration variable can now be set on a per-problem basis. (r200)
  • The file search path has changed: .../testconf.py is now given precedence over .../tests/testconf.py, and within archives, a/tests/b/c is now given precedence over a/c. (r208)
  • ‘Incorrect’ is now removed from the beginning of output of external output validators. (r206)
Platform-independent API changes:
  • Callable output validators must now return three-tuples (number granted, bool correct, str comment) rather than two-tuples as before, and to return the verdict ‘wrong answer’, they must now explicitly raise upreckon.exceptions.WrongAnswer. These changes are backwards-incompatible. (r205)
UNIX-specific bug fixes:
  • Fixed several compilation errors and (hopefully) harmless run-time bugs. (r185–186, r190–191)
  • Keyboard interrupts now satisfy the prompt for a key press at the end of each Upreckon run even when _unix is in use. (r210)
2.02.0
Platform-independent feature changes:
  • The built-in output validator is now much faster. (r174)
  • The built-in output validator now ignores differences in line separators even if reference outputs are stored in tape archives. (r174)
  • The binary configuration variable has been added. If it is true, the built-in output validator does not ignore differences in line separators. (r174)
  • All input/output of test case data is now done in binary mode and is thus encoding- and binary-safe. (r174)
  • tests directories are now searched for within archives. (r178)
Platform-independent bug fixes:
  • The built-in output validator now correctly handles outputs shorter than reference outputs stored in archives. (r174, r180)
UNIX-specific bug fixes:
  • stdio=False no longer crashes Upreckon with _unix compiled. (r176)
2.01.2
Platform-independent bug fixes:
  • Callable values of the tester configuration variable no longer crash Upreckon (regression in 2.01.1). (r171)
2.01.1
Platform-independent changes:
  • The first/only element of the tester configuration variable is now treated as a POSIX path and automatically nativized (the same way that the first/only element of the testee configuration variable is handled). (r161)
Windows-specific work-arounds for foreign bugs:
  • Some (broken) output validators that were denied access to the input file are no longer denied access to it. (r160)
2.01.0
Platform-independent feature changes:
  • The --list-problems command-line option has been added. It prints all problem names in the current test configuration, one name per line. (r102)
  • Output-only problems are now supported. To make a problem output-only, set kind='outonly' in testconf. No input file names in testconf are required (or used). (r104, r145)
  • Python’s distutils are now used, and Upreckon’s modules are now in the package upreckon. (r146, r154)
  • The testee configuration variable and the upreckon.config.nativize_path function (usable within testconf) have been added. testee should be used instead of path and name; it specifies the string or iterable to be passed to subprocess.Popen except that its first/only element is a path in POSIX format and is automatically converted to the native path format at run-time (to prevent this, prefix it with slash-slash-colon). (r150, r156)
  • Positional arguments on the command line are now treated as specific test case identifiers to test, overriding the tests configuration variable and disabling sample tests and test groups. (r151–152)
UNIX-specific changes:
  • An implementation of the unix module in C and C++ (named upreckon._unix) has been added. (r136–137)
  • Exiting due to SIGINT is now properly reported to the parent process. (r138)
  • Lots of race conditions and other bugs have been fixed. (r118–119, r123–124, r126–129)
Windows-specific changes:
  • When maxwalltime is true but maxcputime is false, wall-clock time is now printed rather than CPU time. (r117)
  • On Windows NT 3.5+, the wall-clock time spent by the program being tested is now calculated from the values reported by GetProcessTimes. (r140)
  • The memory-limit-exceeded verdict is no longer given to testees that terminate with EXCEPTION_ACCESS_VIOLATION. (r122)
  • Upreckon should no longer crash when AssignProcessToJobObject fails with ERROR_ACCESS_DENIED. (r148, r157)
  • App Paths are now checked when launching output validators. (r155)
2.00.1
Platform-independent bug fixes:
  • A race condition resulting in times being printed twice in a row has been fixed. (r103 backported as r109)
  • Absent output files no longer crash Upreckon. (r105 backported as r110)
  • Multi-problem legacy configurations are now handled properly. (r107 backported as r111)
UNIX-specific bug fixes:
  • A crash that occurred when using output validators has been fixed. (r108 backported as r112)
2.00.0
Lorem ipsum. The tracker will be of more use to you.
1.20.3
Platform-independent bug fixes:
  • Another exception in the code handling ZIP archives has been eliminated.
Platform-dependent bug fixes (notably, Windows and OS/2 are affected):
  • Yet another exception in the code handling ZIP archives has been eliminated.
1.20.2
Platform-independent bug fixes:
  • ZIP archives with test case files in the root catalog no longer raise exceptions and instead work.
1.20.1
Platform-independent bug fixes:
  • Fractional numbers of weighted points really are no longer rounded.
1.20.0
Platform-independent feature changes:
  • Fractional numbers of weighted points are no longer rounded. (See the change log of 1.20.1.)
  • Whitespace is now removed from the start and the end of the output of output validators.
  • More locations of test and configuration data are now supported. Archived test and configuration data are now supported.
  • The zipfile.py file has been added to support bzip2 compression in ZIP archives. The standard zipfile Python module does not support bzip2.
  • The problems command-line parameter has been added to allow testing specific problems against a multi-problem configuration. problems is the longest list of problem names taken from the tasknames configuration variable that is leading the positional argument list. Only the problems listed will be tested. The configuration files of skipped problems are not read.
  • The dummies, dummyinname and dummyoutname configuration variables and the tuple syntax of the padwithzeroestolength configuration variable have been added to allow better support for sample test cases. The behaviour of the new variables is the same as that of tests, testcaseinname and testcaseoutname. The tuple syntax of padwithzeroestolength is this: if the variable is given as a 2-tuple (pair) of integers, the first integer gives the old-style padwithzeroestolength value and the second integer gives the same for sample test cases. If it is not but sample test cases are used, padwithzeroestolength is processed as if it was a tuple containing twice itself. If specific test case numbers are given on the command line, no sample test cases are included in the test. It is impossible to set on the command line the sample test cases to test against. It is impossible to exclude certain sample test cases.
  • In multi-problem configurations, configuration variables specified in the global configuration file now serve as defaults for problem-specific configuration variables.
  • In the value of the inname, outname, ansname, testcaseinname, testcaseoutname, dummyinname and dummyoutname configuration variables, the percent sign (%) is now a special character. It is replaced with the currently assessed problem’s name.
  • When the option -s or -m is set, defaults are now provided for the inname, outname and ansname configuration variables (taskname.in, taskname.out and taskname.ans respectively).
  • When the option -s or -m is set and the assessed program uses standard input/output, the correct output is now copied to the current working directory just like the original input and (for the -s option) the program’s output.
  • The -m option has been added. It emulates the -s option without actually performing any testing. As a consequence, the program’s output is not copied to the current working directory because there is no program’s output.
  • The -t option has been added. It adds an initial one-second delay and checks two time measurement functions available in Python to heuristically determine which of them is more precise and uses it to measure the running time of assessed programs.
  • When an output validator is used, the inname configuration variable is not obligatory now. This is because in some problems the original input is not needed to validate the output generated by the assessed program.
  • Before running the program to be assessed, the output file is now erased to check whether the program creates the output file. Note that this makes programs that try to append to the output file legal.
  • The devnull configuration variable is now obsolete. Instead, the value supplied by the Python standard library is used.
Platform-independent ideology changes:
  • Default values for configuration variables can now be relied upon (and are documented).
Windows-specific feature changes:
  • Time is now measured much more precisely.
Platform-independent bug fixes:
  • Fractional numbers of points (both weighted and unweighted) are now output correctly.
  • When the -c option is specified and the configuration is multi-problem, all problems’ input/output files are now erased. Before, only the first problem’s input/output files were erased.
  • Error messages about missing configuration variables should refer to them as ‘configuration variables’, not as ‘configuration names’. They did so as late as on 1.16.1 but sometime between 1.16.1 and 1.19.0 this behaviour was broken. Now they do so again.
  • When the assessed program broke the time limit, on some platforms it might not be terminated for long periods of time. It should now always be terminated immediately. (I have never experienced this problematic behaviour on Windows XP.)
  • If the assessed program does not create an output file or it cannot be accessed by test.py, test.py does not crash any more; instead, the program is given zero points for the test case.
  • Keyboard interrupts (Ctrl+C being pressed by the user) are never suppressed now. (Any other obscure interrupts that might have been suppressed in the past should no longer be. In technical terms, all bare except: clauses have been replaced with except Exception: clauses.)
Trivia:
  • The size of test.py has risen more than 1.99 times. Together with the newly added zipfile.py, the full test.py 1.20.0 package is more than 6.38 times larger than test.py 1.19.0.
  • The list of changes is the biggest I have ever had in any project.
  • This is the first version that has had many bugs preventing features from working as described. Previously, only a couple releases had bugs during development or after release (two if I am not mistaken), and then few of those (one if I am not mistaken ;) had more than one such bug.

Development Progress

For information on the progress of version 2, see the dedicated tracker page.