Using cgroups allows throttling the indexer CPU usage.

The main reason to do this would be to prevent the CPU from heating and triggering a fan spinup when the noise would be inconvenient.

There are two main cases, depending on whether systemd is in use on the system.

With systemd

When systemd is in use on a system it mostly takes over the cgroup hierarchy (typically mounted on /sys/fs/cgroup), and provides easy methods to use cgroup functionality.

A normal Recoll Linux installation includes two systemd unit files, typically located in /usr/share/recoll/examples and described in the Recoll manual.

Very recent versions of these files include the following comments in the [Service] section:

# CPU usage control examples. If you set this low, you probably also want to configure the
# recollindex multithreading parameters in recoll.conf. Use thrTCounts to limit the number of
# threads, or disable multithreading altogether with thrQSizes = -1 -1 -1
# CPUQuota=40% # the indexer will use no more than 40% of one CPU core.
# CPUQuota=100% # the indexer will use no more than 100% of one CPU core.
# CPUQuota=250% # the indexer will use no more than the capacity of two and half CPU cores.

Including a CPUQuota line in the unit file [Service] section is all you need to control the CPU usage if you are starting recollindex this way.
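For instance, assuming you run the example unit as a user service named recollindex.service, a drop-in file can set the quota without editing the installed unit file. This is a hypothetical sketch; adjust the service name and value to your setup:

```shell
# Hypothetical drop-in: cap the indexer at 50% of one core without
# modifying the installed unit file.
mkdir -p ~/.config/systemd/user/recollindex.service.d
cat > ~/.config/systemd/user/recollindex.service.d/cpu.conf <<'EOF'
[Service]
CPUQuota=50%
EOF
# Then reload the user manager: systemctl --user daemon-reload
```

systemd merges drop-in files into the unit, so the example unit file itself stays untouched across upgrades.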

Another possible approach, which avoids creating a unit file explicitly, is to use systemd-run. For example:

systemd-run --user --wait --property=CPUQuota=10% recollindex -z

systemd-run has numerous other options; check its manual page.

Using libcgroup and cgroup-tools (optionally in conjunction with systemd)

Note
tested on Linux Mint (versions 20 through to 22) Cinnamon.

A kernel switch that may be required

One might need to add cgroup_enable=cpu to one's kernel boot parameters. (There are instructions on how to do that here. The program 'Grub Customizer' can do it too.)
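On GRUB-based systems this typically means editing /etc/default/grub along these lines (a sketch: the existing option values on your system may differ), then regenerating the GRUB configuration:

```shell
# /etc/default/grub -- append cgroup_enable=cpu to the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash cgroup_enable=cpu"
# then run: sudo update-grub
```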

Install required packages

sudo apt install cgroup-tools

which also installs libcgroup2.

Set up a configuration file

I believe that the following is a prerequisite: copy the template for the global configuration file into place as the actual global configuration file (we will not need to alter the resulting file). To do that:

sudo cp /usr/share/doc/cgroup-tools/examples/cgred.conf /etc/

Create two other files

Two further files are needed: one that specifies one's 'groups', and one that allocates programs to groups. First create those files, as follows.

# The file that specifies groups.
sudo touch /etc/cgconfig.conf
# The file that allocates programs to groups.
sudo touch /etc/cgrules.conf

One can edit each file with one's favourite editor, so long as the editor can write to a root-owned file. (So, one needs to run the editor as root, or the editor needs the ability to prompt for a password.)

Create some group(s) by editing the groups file (/etc/cgconfig.conf)

group app/indexer {
  cpu {
    cpu.max = "1000 2500";
  }
}

The string 'indexer' is arbitrary. (Perhaps the same goes even for 'app'.)

The value of cpu.max is a pair: it has the form "x <space> y", with the quotation marks. x is the amount of CPU time, in microseconds, that the processes at issue may use within each period of y microseconds. Neither x nor y can be less than 1000.
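As a worked example, using the values from the group definition above:

```shell
# "1000 2500" lets the group run for 1000 time units out of every 2500,
# i.e. 40% of one core:
max=1000
period=2500
echo "$(( max * 100 / period ))%"   # prints 40%
```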

(In principle the cgconfig allows one to set a value called 'io.weight'. However, I have been unable to get that to work.)

Allocate some program(s) to groups by editing the rules file (/etc/cgrules.conf)

Examples:

*:recoll                            cpu     app/indexer/
"*:processNameThatHasA space"       cpu     app/indexer/

Note: the behaviour for process-names that contain a space is set to change. (See here. The change is meant to be backwards-compatible, though.)

Apply the rules

sudo cgconfigparser -l /etc/cgconfig.conf && sudo cgrulesengd

The command might yield cryptic error messages. 'No such file or directory' seems to indicate an invalid field name within either /etc/cgconfig.conf or /etc/cgrules.conf. 'Invalid argument' seems to indicate an invalid value assignment within either file; note that, within cgconfig.conf, assignments of values to variables must end with a semi-colon (as in the example above).
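To check that a process actually landed in the intended group, read its /proc/<pid>/cgroup file. Shown here for the current shell; for the indexer, substitute the PID of a running recollindex process:

```shell
# On a cgroup v2 system this prints a line like "0::/app/indexer" once
# the process has been moved, or "0::/" / a session path otherwise.
cat /proc/self/cgroup
```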

Get the rules applied at boot

You will need a way to execute the above cgconfigparser and cgrulesengd commands when the machine starts up. This of course depends on your specific init system and can't be fully described here, except to note the following: on Linux Mint (and presumably on other distributions that use systemd by default) the rules are applied automatically at boot.
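On systemd systems where this does not happen automatically, a minimal unit along the following lines could run the two commands at boot. This is a hypothetical sketch: the unit name is invented, and the binary paths may differ on your distribution:

```ini
# /etc/systemd/system/cgroup-rules.service  (hypothetical example)
[Unit]
Description=Apply libcgroup configuration and start the rules daemon
After=local-fs.target

[Service]
Type=forking
ExecStartPre=/usr/sbin/cgconfigparser -l /etc/cgconfig.conf
ExecStart=/usr/sbin/cgrulesengd

[Install]
WantedBy=multi-user.target
```

Type=forking is used because cgrulesengd daemonizes by default; enable the unit with systemctl enable.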

Without systemd and libcgroup, using a direct script

Depending on your actual system, the cgroup hierarchy may or may not be mounted at boot time. For example, at this time, on Devuan, cgroups are mounted at boot time and processes are placed in session cgroups. Alpine does not mount the cgroups tree by default. I did not look at others.

The following is an example script for Devuan, executing a command with a given CPU percentage limit, placed in its own cgroup under /sys/fs/cgroup/user-xxx.slice/commandname.

This does not aspire to be an example of shell programming, but it appears to do the job and does not wreak havoc on the system as far as I can see:

#!/bin/sh
# ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
#
# Starting a process in a separate cgroup to control its CPU usage, on
# a devuan machine, using cgroup v2
#
# We need to do some stuff as root, there does not seem to be any way
# around it (without delegating the work to systemd).
#
# this is just a test script, read and understand before using it.

Usage()
{
    echo 'Usage: cpumax.sh cpumaxpercent command [args]' 1>&2
    exit 1
}

test $# -ge 2 || Usage


# CPU share as a percentage. This can be > 100 (to use more than one core)
cpushare=$1
shift
# Rest of the command line: command to run
cmd=$*

# Arbitrary name for my cgroup, using the command name here
executable=`echo $cmd | cut -d ' ' -f 1`
cgroupname=`basename $executable`
echo $cgroupname

myuid=`id -u`
echo Using uid $myuid

# Existing cgroups on devuan appear to be like /sys/fs/cgroup/n where n is an integer,
# looks like a sequential session number. Not clear what would cause a cleanup, the groups
# keep accumulating for subsequent ssh sessions (e.g. while true;rsync...)

# We need to enable the cpu controller in the root. This is not on by default on devuan
echo "+cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control > /dev/null


# Create a cgroup for this user. This does not change the fact that
# we'll need root to move the process, done just to keep things a little tidier.
usercgroup=/sys/fs/cgroup/user-${myuid}.slice
test -d $usercgroup || sudo mkdir $usercgroup || exit 1
# Activate the cpu controller in there
echo "+cpu" | sudo tee $usercgroup/cgroup.subtree_control > /dev/null

# Cgroup for the command
mycgroup=$usercgroup/$cgroupname
test -d $mycgroup || sudo mkdir $mycgroup || exit 1
# Do not enable the cpu controller in $mycgroup/cgroup.subtree_control:
# a cgroup cannot both contain processes and enable controllers for its
# children (the cgroup v2 "no internal processes" rule), so doing that
# would make the cgroup.procs write below fail. Enabling cpu in the
# parent's subtree_control above is what makes cpu.max available here.

# Move this process to the new cgroup
echo Writing my PID $$ into $mycgroup/cgroup.procs
echo $$ | sudo tee $mycgroup/cgroup.procs > /dev/null ||  exit 1

# Compute a value to write to cpu.max
pcshare=`expr 100000 '*' ${cpushare} / 100`
# and write it
echo writing $pcshare 100000 to $mycgroup/cpu.max
echo $pcshare 100000 | sudo tee $mycgroup/cpu.max > /dev/null || exit 1

# Start the command.
$cmd