Herodotus/How to try herodotos

From ALT Linux Wiki
Revision as of 23:23, 16 September 2020 by Imz (talk | contribs) (note that /etc/resolv.conf could be a symlink)

One can install the herodotos tool (for p8) from the task mentioned in Herodotus#herodotos_in_p8: task 257760. (Task 243195 still has an unmet dependency on gumtree, which is optional.)

A good way to try herodotos is to reproduce the authors' own work, https://github.com/coccinelle/faults-in-linux . (It is the more recent one; the older work, http://coccinelle.lip6.fr/papers/aosd10.pdf , with its data and configuration, is not suitable for the current herodotos 0.8+ versions.)

I've adapted their herodotos config files and made it a Gear repo: http://git.altlinux.org/people/imz/public/faults-in-Linux.git , so that one can easily pass it to hasher and do the processing in an isolated, easily reproducible hasher environment.

  • First, prepare: clone my repo and set up the sources for APT:
$ git clone --depth=20 git://git.altlinux.org/people/imz/public/faults-in-Linux.git
$ cd faults-in-Linux
$ apt-repo --hsh-apt-config=/home/imz/.hasher/p8/apt.conf add 257760
Here is what the APT sources config for hasher should look like (and what our current working directory is):
$ apt-repo --hsh-apt-config=/home/imz/.hasher/p8/apt.conf
rpm [updates] file:/ALT/p8 x86_64 classic
rpm [updates] file:/ALT/p8 noarch classic
rpm http://git.altlinux.org repo/257760/x86_64 task
$ pwd
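A quick way to confirm that the task repository really is among the hasher APT sources is to filter the listing shown above. The following is a minimal sketch, not part of apt-repo: the helper name has_task_source is hypothetical, and it only assumes the "repo/<task>/<arch> task" line format visible in that listing.

```shell
# Hypothetical helper: succeed if the sources listing on stdin
# contains a task repository line for the given task number.
has_task_source() {
    task=$1
    # Match e.g. "rpm http://git.altlinux.org repo/257760/x86_64 task"
    grep -q "repo/${task}/[^ ]* task\$"
}

# Usage (config path as in the example above):
#   apt-repo --hsh-apt-config=/home/imz/.hasher/p8/apt.conf \
#       | has_task_source 257760 && echo "task 257760 is configured"
```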
  • Then, we execute the authors' processing rules under the control of my .gear/faults-in-Linux.spec file from the master branch. It automatically fetches and checks out various revisions of the Linux sources, so you must have enough disk space to hold them:
  • Initialize the hasher chroot.
$ hsh --apt-config=/home/imz/.hasher/p8/apt.conf --without-stuff --ini
  • Prepare for networking in hasher. (The study.hc herodotos config in this example refers to our kernel Git repository hosted on the network. And it'll have to resolve the host name.)
$ hsh-run --root -- sh -c 'cat >/etc/resolv.conf' </etc/resolv.conf
$ export share_network=1
(Instead of statically copying resolv.conf, one could bind-mount the file from the host system, i.e. $(realpath /etc/resolv.conf) so as to avoid symlinks, using mount --rbind and mount --make-rslave, plus mount -o remount,ro for extra safety in addition to hasher's UID switching. One could even try to configure this in /etc/hasher-priv/fstab, if hasher-priv understood all these mount options.)
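Since /etc/resolv.conf may be a symlink, it helps to see which real file it resolves to before binding it. A minimal sketch using realpath from coreutils; the function name resolv_target is hypothetical and not part of hasher:

```shell
# Hypothetical helper: print the real file behind a (possibly symlinked)
# resolv.conf; defaults to the host's /etc/resolv.conf.
# Fails if the path does not resolve to an existing regular file.
resolv_target() {
    target=$(realpath "${1:-/etc/resolv.conf}") || return 1
    [ -f "$target" ] || return 1
    printf '%s\n' "$target"
}

# Usage:
#   resolv_target    # prints the resolved path of /etc/resolv.conf
```

The printed path is what one would use as the source of the bind mount described above.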
  • Optionally, test that resolving the host name works.
$ hsh-install telnet
$ hsh-run -- telnet git.altlinux.org 80
  • Finally, run herodotos, which will download the sources (the Linux kernel in this example), unpack different revisions, cache them, and analyze.
$ gear --hasher -- hsh-rebuild 2>&1 | tee hsh.log.1
The build stops after the step of applying the static analyzer (coccinelle) to each version of the sources (linux). The results are saved in /usr/src/HERODOTOS/ (inside hasher). I've copied them and saved them in commit ad458b0c2 in the EXPERI/imz2/apply-analyzer-results branch, so that you can get an idea of what they look like:
  • the individual per-version *.orig.org files.

/usr/src/HERODOTOS/ is used as the place to cache the analyzed sources and to save the (intermediate and final) results, so it won't be cleaned if you run gear --hasher -- hsh-rebuild again (after editing the Git repo with the Makefiles, configs, etc.). (TODO: Unfortunately, the automatically filled faults/.projects_study.hc file is not relocatable in a similar manner.)
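
One caveat of piping the build into tee, as in the build command above, is that the shell reports the exit status of tee rather than that of the build. A minimal POSIX sh sketch that keeps both the log and the build's own status; the function name run_logged and the temporary status file are illustrative and not part of gear or hasher:

```shell
# Illustrative helper: run a command, copy its combined output to a log
# file via tee, and return the command's own exit status (not tee's).
run_logged() {
    log=$1; shift
    status_file=$(mktemp) || return 1
    # Record the command's status in a file, because the pipeline
    # itself only reports the status of its last element (tee).
    { rc=0; "$@" 2>&1 || rc=$?; echo "$rc" >"$status_file"; } | tee "$log"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}

# Usage:
#   run_logged hsh.log.1 gear --hasher -- hsh-rebuild \
#       || echo "build failed, see hsh.log.1"
```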

  • The next step (correlation of the warnings between versions by herodotos) has to be run manually (because I wanted to be able to commit the results of the previous step first):
$ hsh-shell --mount=/proc,/dev/pts
$ cd /usr/src/RPM/BUILD/faults-in-Linux-20181023/faults/
$ make correl
or as a single command:
$ hsh-run --mount=/proc -- sh -c 'cd /usr/src/RPM/BUILD/faults-in-Linux-20181023/faults/ && make correl'
I saved the results in commit c3f5e56dd7e in the EXPERI/imz2/correl-gnudiff-results branch, so that you can get an idea of what they look like:
  • some non-empty *.correl.org files with undecided possible correlations (marked as TODO);
  • the *.new.org files with merged warnings from all versions. For each warning (initially marked as TODO), one has to decide whether it is a real error or a false warning.
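
To get a sense of how much manual review remains, one can count the entries still marked TODO in these files. This is a rough sketch only: it assumes that the TODO keyword follows the leading asterisks of an Org headline (the usual Org-mode convention), which should be checked against the actual files; count_todo is a hypothetical helper name.

```shell
# Hypothetical helper: for each given Org file, print "<count> <file>",
# counting headlines whose keyword is TODO (assumed form: "* TODO ...").
count_todo() {
    for f in "$@"; do
        # grep -c exits non-zero when the count is 0, hence "|| true".
        n=$(grep -c '^\*\{1,\} TODO' "$f" || true)
        printf '%s %s\n' "$n" "$f"
    done
}

# Usage, in the directory holding the result files:
#   count_todo *.new.org *.correl.org
```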

(In this example, I made herodotos use the --diff gnudiff option, because the default (and better) --diff hybrid requires gumtree and doesn't work correctly if it is absent.)

A follow-up scenario would be to first mark some warnings as checked and then add another version of the project into consideration (by editing the pattern in faults/study.hc.base) and see how the warnings concerning the new version are merged with the marks for the old versions. Let's explore this.