List, Query, Manipulate System Processes

Introduction

ps implements an API to query and manipulate system processes. Most of its code is based on the psutil Python package.

Installation

You can install the released version of ps from CRAN with:
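
The usual CRAN install call is sketched below. It assumes that a CRAN mirror is configured in the R session.

```r
# Install the released version of ps from CRAN.
install.packages("ps")
```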

Supported platforms

ps currently supports Windows (from Vista), macOS and Linux systems. On unsupported platforms the package can be installed and loaded, but all of its functions fail with an error of class "not_implemented".

Listing all processes

ps_pids() returns all process ids on the system. This can be useful to iterate over all processes.
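
A minimal sketch of working with the ids, assuming ps is installed. The variable name pids is illustrative only.

```r
library(ps)

# All process ids currently known to the OS.
pids <- ps_pids()

# The calling R process is one of them.
Sys.getpid() %in% pids

# Show the first 20 ids.
head(pids, 20)
```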

##  [1]  0  1 48 49 51 52 53 55 58 59 60 61 66 70 72 73 78 79 81 83

ps() returns a data frame (a tibble if you have the tibble package available), with data about each process. It contains a handle to each process in the ps_handle column. You can use these handles to perform more queries on the processes.

ps()
## # A tibble: 467 x 11
##      pid  ppid name   username status    user  system     rss     vms created             ps_handle
##  * <int> <int> <chr>  <chr>    <chr>    <dbl>   <dbl>   <dbl>   <dbl> <dttm>              <I(list)>
##  1 15618 15603 R      gaborcs… runni…  0.743   0.138   1.20e8  2.73e9 2018-11-07 09:38:47 <S3: ps_…
##  2 15603 69354 R      gaborcs… runni…  1.80    0.227   1.21e8  2.69e9 2018-11-07 09:38:42 <S3: ps_…
##  3 15404     1 quick… gaborcs… runni…  0.0578  0.0362  2.86e7  3.09e9 2018-11-07 09:37:38 <S3: ps_…
##  4 15403  9730 Googl… gaborcs… runni…  0.180   0.0634  6.72e7  3.00e9 2018-11-07 09:36:59 <S3: ps_…
##  5 15402  9730 Googl… gaborcs… runni…  0.0836  0.0452  4.94e7  2.97e9 2018-11-07 09:36:58 <S3: ps_…
##  6 15364 15363 docke… gaborcs… runni…  2.05    0.277   3.06e7  2.58e9 2018-11-07 09:34:41 <S3: ps_…
##  7 15363 89645 docke… gaborcs… runni…  0.0482  0.0354  7.84e6  2.53e9 2018-11-07 09:34:41 <S3: ps_…
##  8 15278  9730 Googl… gaborcs… runni…  4.13    1.32    4.80e7  2.80e9 2018-11-07 09:33:51 <S3: ps_…
##  9 15274  9730 Googl… gaborcs… runni… 13.6     1.99    2.29e8  3.22e9 2018-11-07 09:33:43 <S3: ps_…
## 10 15272     1 backu… root     runni… NA      NA      NA      NA      2018-11-07 09:33:24 <S3: ps_…
## # ... with 457 more rows

Process API

This is a short summary of the API. Please see the documentation of the various methods for details, in particular regarding handles to finished processes and pid reuse. See also “Finished and zombie processes” and “pid reuse” below.

ps_handle(pid) creates a process handle for the supplied process id. If pid is omitted, a handle to the calling process is returned:
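
For example, assuming ps is attached, a handle to the current R session can be created and printed as follows. The name p is the one used by the calls in the rest of this section.

```r
library(ps)

# An explicit pid can be supplied; here it is the pid of this R process.
q <- ps_handle(Sys.getpid())

# With no pid argument, the handle refers to the calling R process.
p <- ps_handle()
p
```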

## <ps::ps_handle> PID=15618, NAME=R, AT=2018-11-07 09:38:47

Query functions

ps_pid(p) returns the pid of the process.

## [1] 15618

ps_create_time(p) returns the creation time of the process (according to the OS).

## [1] "2018-11-07 09:38:47 GMT"

The process id and the creation time together uniquely identify a process in a system. ps uses them to make sure that it reports information about, and manipulates, the correct process.
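
As an illustration of that identity, the sketch below builds a (pid, creation time) key for the current process and compares the keys of two separately created handles. The helper proc_key() is hypothetical and not part of ps.

```r
library(ps)

# Hypothetical helper: a string key built from pid and creation time.
proc_key <- function(p) {
  created <- format(ps_create_time(p), "%Y-%m-%d %H:%M:%OS3", tz = "UTC")
  paste(ps_pid(p), created)
}

p1 <- ps_handle()
p2 <- ps_handle(Sys.getpid())

# Both handles point at the same process, so the keys agree.
same_process <- identical(proc_key(p1), proc_key(p2))
same_process
```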

ps_is_running(p) returns whether p is still running. It handles pid reuse safely.

## [1] TRUE

ps_ppid(p) returns the pid of the parent of p.

## [1] 15603

ps_parent(p) returns a process handle to the parent process of p.

## <ps::ps_handle> PID=15603, NAME=R, AT=2018-11-07 09:38:42

ps_name(p) returns the name of the program p is running.

## [1] "R"

ps_exe(p) returns the full path to the executable that p is running.

## [1] "/Library/Frameworks/R.framework/Versions/3.5/Resources/bin/exec/R"

ps_cmdline(p) returns the command line (executable and arguments) of p.

## [1] "/Library/Frameworks/R.framework/Resources/bin/exec/R"                         
## [2] "--slave"                                                                      
## [3] "--no-save"                                                                    
## [4] "--no-restore"                                                                 
## [5] "-f"                                                                           
## [6] "/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T//RtmppUEiNg/file3cf33c11b009"

ps_status(p) returns the status of the process. Possible values are OS dependent, but they typically include "running" and "stopped".

## [1] "running"

ps_username(p) returns the name of the user the process belongs to.

## [1] "gaborcsardi"

ps_uids(p) and ps_gids(p) return the real, effective and saved user ids and group ids of the process, respectively. They are only implemented on POSIX systems.

if (ps_os_type()[["POSIX"]]) ps_uids(p)
##      real effective     saved 
##       501       501       501
if (ps_os_type()[["POSIX"]]) ps_gids(p)
##      real effective     saved 
##        20        20        20

ps_cwd(p) returns the current working directory of the process.

## [1] "/Users/gaborcsardi/works/ps"

ps_terminal(p) returns the name of the terminal of the process, if any. For processes without a terminal, and on Windows, it returns NA_character_.

## [1] NA

ps_environ(p) returns the environment variables of the process. ps_environ_raw(p) does the same, in a different form. Typically they reflect the environment variables at the start of the process.

ps_environ(p)[c("TERM", "USER", "SHELL", "R_HOME")]
## TERM                          xterm-256color
## USER                          gaborcsardi
## SHELL                         /bin/zsh
## R_HOME                        /Library/Frameworks/R.framework/Resources

ps_num_threads(p) returns the current number of threads of the process.

## [1] 4

ps_cpu_times(p) returns the CPU times of the process, similarly to proc.time().

##            user          system    childen_user children_system 
##       0.9372962       0.1551544              NA              NA

ps_memory_info(p) returns memory usage information. See the manual for details.

##        rss        vms    pfaults    pageins 
##  124092416 2703060992      31608        325

ps_children(p) lists all child processes (potentially recursively) of p.
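
A short sketch, assuming the recursive argument of ps_children(). It lists the children of the parent of the current R process, which include the R process itself. The variable names are illustrative.

```r
library(ps)

p <- ps_handle()
parent <- ps_parent(p)

# recursive = TRUE also includes grandchildren and deeper descendants.
descendants <- ps_children(parent, recursive = TRUE)

# Direct children of the parent; p itself is one of them.
kids <- ps_children(parent)
kids
```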

## [[1]]
## <ps::ps_handle> PID=15618, NAME=R, AT=2018-11-07 09:38:47

ps_num_fds(p) returns the number of open file descriptors (handles on Windows):
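
One way to see the count change is to open a temporary file between two calls. This is a sketch only: the variable names are illustrative and the absolute counts depend on the session.

```r
library(ps)

p <- ps_handle()
before <- ps_num_fds(p)
before

# Open a temporary file; the count is expected to go up.
tmp <- tempfile()
f <- file(tmp, "w")
after <- ps_num_fds(p)
after

# Clean up.
close(f)
unlink(tmp)
```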

## [1] 4
## [1] 5

ps_open_files(p) lists all open files:
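
The sketch below lists the open files before, during and after a temporary file is held open. The variable names are illustrative, and the exact paths differ between systems.

```r
library(ps)

p <- ps_handle()
ps_open_files(p)

# Open a temporary file for writing; it should now show up in the listing.
tmp <- tempfile()
f <- file(tmp, "w")
open_now <- ps_open_files(p)
open_now

# After closing the connection, the file is no longer listed.
close(f)
unlink(tmp)
ps_open_files(p)
```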

## # A tibble: 2 x 2
##      fd path                                                                                
##   <int> <chr>                                                                               
## 1     0 /dev/null                                                                           
## 2     3 /private/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T/RtmppUEiNg/file3cf33c11b009
## # A tibble: 3 x 2
##      fd path                                                                                
##   <int> <chr>                                                                               
## 1     0 /dev/null                                                                           
## 2     3 /private/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T/RtmppUEiNg/file3cf33c11b009
## 3     4 /private/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T/Rtmp5G3HjH/file3d0222d724a8
## # A tibble: 2 x 2
##      fd path                                                                                
##   <int> <chr>                                                                               
## 1     0 /dev/null                                                                           
## 2     3 /private/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T/RtmppUEiNg/file3cf33c11b009

Process manipulation

ps_suspend(p) suspends (stops) the process. On POSIX it sends a SIGSTOP signal. On Windows it stops all threads.

ps_resume(p) resumes the process. On POSIX it sends a SIGCONT signal. On Windows it resumes all stopped threads.
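
A sketch of suspending and resuming a throwaway child process. It assumes a POSIX system with a sleep command and uses the processx package to start the child; neither is required by ps itself, and the status strings are OS dependent.

```r
library(ps)

# Start a child process to experiment on (processx and 'sleep' are assumptions).
px <- processx::process$new("sleep", "10")
p <- ps_handle(px$get_pid())

# Suspend it and give the OS a moment to deliver the signal.
ps_suspend(p)
Sys.sleep(0.2)
status_suspended <- ps_status(p)

# Resume it again.
ps_resume(p)
Sys.sleep(0.2)
status_resumed <- ps_status(p)

# Clean up the child and let processx collect its exit status.
ps_kill(p)
px$wait(1000)

c(suspended = status_suspended, resumed = status_resumed)
```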

ps_send_signal(p, sig) sends the signal sig to the process. It is implemented on POSIX systems only. It makes an effort to work around pid reuse.

ps_terminate(p) sends SIGTERM to the process. It is implemented on POSIX systems only.

ps_kill(p) terminates the process. It sends SIGKILL on POSIX systems, and uses TerminateProcess() on Windows. It makes an effort to work around pid reuse.

ps_interrupt(p) interrupts a process. It sends a SIGINT signal on POSIX systems, and it can send a CTRL+C or a CTRL+BREAK event on Windows.

Finished and zombie processes

ps handles finished and zombie processes as much as possible.

The essential ps_pid(), ps_create_time(), ps_is_running() functions and the format() and print() methods work for all processes, including finished and zombie processes. Other functions fail with an error of class "no_such_process" for finished processes.

The ps_ppid(), ps_parent(), ps_children(), ps_name(), ps_status(), ps_username(), ps_uids(), ps_gids() and ps_terminal() functions and the signal sending functions work properly for zombie processes. Other functions fail with an error of class "zombie_process".
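
Because these errors carry classes, they can be caught selectively with tryCatch(). The sketch below assumes a POSIX sleep command and uses the processx package to create a process that has already finished; the fallback value NA_character_ is an arbitrary choice.

```r
library(ps)

# Start a child, keep a handle to it, then stop it (processx is an assumption).
px <- processx::process$new("sleep", "10")
p <- ps_handle(px$get_pid())
px$kill()
px$wait()

# The essential functions still work on the finished process.
ps_pid(p)
still_running <- ps_is_running(p)

# Other queries signal a classed error, which is turned into NA here.
name <- tryCatch(
  ps_name(p),
  no_such_process = function(e) NA_character_,
  zombie_process = function(e) NA_character_
)
name
```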

Pid reuse

ps functions handle pid reuse as well as technically possible.

The query functions never return information about the wrong process, even if the process has finished and its process id was re-assigned.

On Windows, the process manipulation functions never manipulate the wrong process.

On POSIX systems, this is technically impossible: a signal cannot be sent to a process without creating a race condition. In ps the time window of the race condition is very small, a few microseconds. To create problems, the process would need to finish, and the OS would need to reuse its pid, within this time window. This is very unlikely to happen.

Recipes

In the spirit of psutil recipes.

Find process by name

Using ps() and dplyr:
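
One possible shape for such a helper is sketched below. The function name find_procs_by_name() and its argument are illustrative, and dplyr is assumed to be installed.

```r
library(ps)
library(dplyr)

# Return the handles of all processes whose name equals `target`.
find_procs_by_name <- function(target) {
  ps() %>%
    filter(name == !!target) %>%  # `name` is the column, `target` the argument
    pull(ps_handle)
}

find_procs_by_name("R")
```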

## [[1]]
## <ps::ps_handle> PID=15618, NAME=R, AT=2018-11-07 09:38:47
## 
## [[2]]
## <ps::ps_handle> PID=15603, NAME=R, AT=2018-11-07 09:38:42
## 
## [[3]]
## <ps::ps_handle> PID=12255, NAME=R, AT=2018-11-06 16:01:03
## 
## [[4]]
## <ps::ps_handle> PID=3382, NAME=R, AT=2018-11-06 14:11:26
## 
## [[5]]
## <ps::ps_handle> PID=91047, NAME=R, AT=2018-11-06 12:09:09
## 
## [[6]]
## <ps::ps_handle> PID=71838, NAME=R, AT=2018-11-06 09:42:47
## 
## [[7]]
## <ps::ps_handle> PID=70692, NAME=R, AT=2018-11-06 09:42:20
## 
## [[8]]
## <ps::ps_handle> PID=69968, NAME=R, AT=2018-11-06 09:42:12
## 
## [[9]]
## <ps::ps_handle> PID=69526, NAME=R, AT=2018-11-06 09:42:03
## 
## [[10]]
## <ps::ps_handle> PID=69354, NAME=R, AT=2018-11-06 09:41:52
## 
## [[11]]
## <ps::ps_handle> PID=67847, NAME=R, AT=2018-11-06 00:45:09
## 
## [[12]]
## <ps::ps_handle> PID=66306, NAME=R, AT=2018-11-05 23:54:25
## 
## [[13]]
## <ps::ps_handle> PID=84819, NAME=R, AT=2018-11-05 16:05:40
## 
## [[14]]
## <ps::ps_handle> PID=84694, NAME=R, AT=2018-11-05 16:05:37

Without creating the full table of processes:
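
A sketch of the same lookup that walks ps_pids() directly. Processes that disappear during the scan, or that cannot be queried, are skipped; the function and argument names are illustrative.

```r
library(ps)

find_procs_by_name <- function(target) {
  procs <- lapply(ps_pids(), function(pid) {
    tryCatch({
      h <- ps_handle(pid)
      if (ps_name(h) == target) h else NULL
    },
    no_such_process = function(e) NULL,  # finished in the meantime
    access_denied = function(e) NULL,    # not ours to query
    error = function(e) NULL)            # any other failure: skip
  })
  # Drop the NULL entries, keep the matching handles.
  Filter(Negate(is.null), procs)
}

find_procs_by_name("R")
```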

## [[1]]
## <ps::ps_handle> PID=3382, NAME=R, AT=2018-11-06 14:11:26
## 
## [[2]]
## <ps::ps_handle> PID=12255, NAME=R, AT=2018-11-06 16:01:03
## 
## [[3]]
## <ps::ps_handle> PID=15603, NAME=R, AT=2018-11-07 09:38:42
## 
## [[4]]
## <ps::ps_handle> PID=15618, NAME=R, AT=2018-11-07 09:38:47
## 
## [[5]]
## <ps::ps_handle> PID=66306, NAME=R, AT=2018-11-05 23:54:25
## 
## [[6]]
## <ps::ps_handle> PID=67847, NAME=R, AT=2018-11-06 00:45:09
## 
## [[7]]
## <ps::ps_handle> PID=69354, NAME=R, AT=2018-11-06 09:41:52
## 
## [[8]]
## <ps::ps_handle> PID=69526, NAME=R, AT=2018-11-06 09:42:03
## 
## [[9]]
## <ps::ps_handle> PID=69968, NAME=R, AT=2018-11-06 09:42:12
## 
## [[10]]
## <ps::ps_handle> PID=70692, NAME=R, AT=2018-11-06 09:42:20
## 
## [[11]]
## <ps::ps_handle> PID=71838, NAME=R, AT=2018-11-06 09:42:47
## 
## [[12]]
## <ps::ps_handle> PID=84694, NAME=R, AT=2018-11-05 16:05:37
## 
## [[13]]
## <ps::ps_handle> PID=84819, NAME=R, AT=2018-11-05 16:05:40
## 
## [[14]]
## <ps::ps_handle> PID=91047, NAME=R, AT=2018-11-06 12:09:09

Wait for a process to finish

On POSIX, there is no good way to wait for non-child processes to finish, so we need to write a sleep-wait loop to do it. (On Windows and on BSD systems, including macOS, there are better solutions.)
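
A minimal sketch of such a sleep-wait loop. The helper wait_for_process() is illustrative and not part of ps. The usage part assumes a POSIX sleep command and the processx package, which also collects the exit status of the child once it finishes.

```r
library(ps)

# Poll until `proc` has finished or `timeout` seconds have passed.
# Returns TRUE if the process is gone, FALSE if it is still running.
wait_for_process <- function(proc, timeout = Inf, sleep = 0.1) {
  deadline <- as.numeric(Sys.time()) + timeout
  while (ps_is_running(proc) && as.numeric(Sys.time()) < deadline) {
    Sys.sleep(sleep)
  }
  !ps_is_running(proc)
}

# Usage: a two second child, checked after one second and again later.
px <- processx::process$new("sleep", "2")
p <- ps_handle(px$get_pid())
first <- wait_for_process(p, timeout = 1)
first
second <- wait_for_process(p, timeout = 10)
second
```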

## [1] FALSE
## [1] TRUE

Kill process tree

This sends a signal, so it’ll only work on Unix. Use ps_kill() instead of ps_send_signal() on Windows.
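
A sketch of how such a recipe might look, assuming signals() from ps for the signal numbers. The helpers kill_proc_tree() and wait_for_processes() are illustrative, and the usage part relies on processx and a POSIX sleep command.

```r
library(ps)

# Poll until all handles are gone or `timeout` seconds have passed.
wait_for_processes <- function(procs, timeout = 3, sleep = 0.1) {
  deadline <- as.numeric(Sys.time()) + timeout
  running <- function() vapply(procs, ps_is_running, logical(1))
  while (any(running()) && as.numeric(Sys.time()) < deadline) Sys.sleep(sleep)
  alive <- running()
  list(gone = procs[!alive], alive = procs[alive])
}

# Signal a process and all of its descendants, then report what is left.
kill_proc_tree <- function(pid, sig = signals()$SIGTERM,
                           include_parent = TRUE) {
  if (pid == Sys.getpid() && include_parent) {
    stop("refusing to kill the current process")
  }
  parent <- ps_handle(pid)
  procs <- ps_children(parent, recursive = TRUE)
  if (include_parent) procs <- c(procs, list(parent))
  for (p in procs) {
    # A process may already be gone by the time it is signalled.
    tryCatch(ps_send_signal(p, sig), no_such_process = function(e) NULL)
  }
  wait_for_processes(procs)
}

# Usage: start a throwaway child and take its tree down.
px <- processx::process$new("sleep", "60")
res <- kill_proc_tree(px$get_pid())
res
```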

## $gone
## $gone[[1]]
## <ps::ps_handle> PID=15631, NAME=???, AT=2018-11-07 09:38:51
## 
## $gone[[2]]
## <ps::ps_handle> PID=15632, NAME=???, AT=2018-11-07 09:38:51
## 
## $gone[[3]]
## <ps::ps_handle> PID=15635, NAME=???, AT=2018-11-07 09:38:53
## 
## $gone[[4]]
## <ps::ps_handle> PID=15636, NAME=???, AT=2018-11-07 09:38:53
## 
## $gone[[5]]
## <ps::ps_handle> PID=15637, NAME=???, AT=2018-11-07 09:38:53
## 
## 
## $alive
## list()

Terminate children

Note that some R IDEs, including RStudio, run a multithreaded R process, and other threads may start processes as well. reap_children() will clean up all of these too, potentially causing the IDE to misbehave or crash.
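
A sketch of what reap_children() might look like: it asks the children of the current process to terminate, waits, then kills the survivors. The implementation details and the timeout default are assumptions, and the usage part relies on processx and a POSIX sleep command.

```r
library(ps)

reap_children <- function(timeout = 3) {
  # Poll until the given handles are gone or the timeout expires;
  # returns a logical vector marking the ones still running.
  still_running <- function(procs) {
    deadline <- as.numeric(Sys.time()) + timeout
    running <- function() vapply(procs, ps_is_running, logical(1))
    while (any(running()) && as.numeric(Sys.time()) < deadline) Sys.sleep(0.1)
    running()
  }

  procs <- ps_children(ps_handle())

  # Ask politely first (SIGTERM, POSIX only); ignore failures.
  for (p in procs) tryCatch(ps_terminate(p), error = function(e) NULL)
  alive <- still_running(procs)

  # Force-kill whatever survived.
  survivors <- procs[alive]
  for (p in survivors) tryCatch(ps_kill(p), error = function(e) NULL)
  stubborn <- still_running(survivors)

  list(gone = c(procs[!alive], survivors[!stubborn]),
       alive = survivors[stubborn])
}

# Usage: start two throwaway children, then clean them up.
px1 <- processx::process$new("sleep", "60")
px2 <- processx::process$new("sleep", "60")
res <- reap_children()
res
```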

## $gone
## $gone[[1]]
## <ps::ps_handle> PID=15638, NAME=???, AT=2018-11-07 09:38:53
## 
## $gone[[2]]
## <ps::ps_handle> PID=15639, NAME=???, AT=2018-11-07 09:38:53
## 
## $gone[[3]]
## <ps::ps_handle> PID=15640, NAME=???, AT=2018-11-07 09:38:53
## 
## 
## $alive
## list()

Filtering and sorting processes

Processes whose name ends with “sh”:

library(ps)
library(dplyr)  # provides %>%, filter(), select(), mutate(), top_n() and arrange()

ps() %>%
  filter(grepl("sh$", name))
## # A tibble: 44 x 11
##      pid  ppid name  username  status     user  system    rss    vms created             ps_handle 
##    <int> <int> <chr> <chr>     <chr>     <dbl>   <dbl>  <dbl>  <dbl> <dttm>              <I(list)> 
##  1 12161 12155 zsh   gaborcsa… running 0.00637 0.0125  1.16e6 2.53e9 2018-11-06 16:00:56 <S3: ps_h…
##  2 12155 12154 zsh   gaborcsa… running 0.296   0.186   2.23e6 2.53e9 2018-11-06 16:00:56 <S3: ps_h…
##  3  3590  3574 zsh   gaborcsa… running 0.00649 0.0113  1.16e6 2.53e9 2018-11-06 14:11:36 <S3: ps_h…
##  4  3574  3571 zsh   gaborcsa… running 0.267   0.126   2.22e6 2.53e9 2018-11-06 14:11:36 <S3: ps_h…
##  5  3337  3331 zsh   gaborcsa… running 0.00403 0.00615 1.16e6 2.53e9 2018-11-06 14:11:26 <S3: ps_h…
##  6  3331  3330 zsh   gaborcsa… running 0.230   0.109   2.22e6 2.53e9 2018-11-06 14:11:25 <S3: ps_h…
##  7  1283  1277 zsh   gaborcsa… running 0.0353  0.0771  1.16e6 2.53e9 2018-11-06 13:58:53 <S3: ps_h…
##  8  1277  1276 zsh   gaborcsa… running 0.749   0.411   2.23e6 2.53e9 2018-11-06 13:58:53 <S3: ps_h…
##  9 71793 71787 zsh   gaborcsa… running 0.00449 0.00680 1.16e6 2.53e9 2018-11-06 09:42:47 <S3: ps_h…
## 10 71787 71786 zsh   gaborcsa… running 0.229   0.111   2.22e6 2.53e9 2018-11-06 09:42:46 <S3: ps_h…
## # ... with 34 more rows

Processes owned by the current user:

ps() %>%
  filter(username == Sys.info()[["user"]]) %>%
  select(pid, name)
## # A tibble: 330 x 2
##      pid name                
##    <int> <chr>               
##  1 15618 R                   
##  2 15603 R                   
##  3 15404 quicklookd          
##  4 15402 Google Chrome Helper
##  5 15364 docker-compose      
##  6 15363 docker-compose      
##  7 15278 Google Chrome Helper
##  8 15274 Google Chrome Helper
##  9 15048 Google Chrome Helper
## 10 15046 Google Chrome Helper
## # ... with 320 more rows

Processes consuming more than 100MB of memory:

ps() %>%
  filter(rss > 100 * 1024 * 1024)
## # A tibble: 21 x 11
##      pid  ppid name     username  status   user system    rss    vms created             ps_handle 
##    <int> <int> <chr>    <chr>     <chr>   <dbl>  <dbl>  <dbl>  <dbl> <dttm>              <I(list)> 
##  1 15618 15603 R        gaborcsa… running  1.67  0.459 1.27e8 2.70e9 2018-11-07 09:38:47 <S3: ps_h…
##  2 15603 69354 R        gaborcsa… running  1.82  0.230 1.21e8 2.69e9 2018-11-07 09:38:42 <S3: ps_h…
##  3 15274  9730 Google … gaborcsa… running 14.1   2.02  2.26e8 3.22e9 2018-11-07 09:33:43 <S3: ps_h…
##  4 15048  9730 Google … gaborcsa… running  1.93  0.328 1.25e8 3.06e9 2018-11-07 09:05:18 <S3: ps_h…
##  5 14908 14899 Slack H… gaborcsa… running 47.1   5.88  5.38e8 4.18e9 2018-11-07 08:07:41 <S3: ps_h…
##  6 14907 14899 Slack H… gaborcsa… running  9.06  2.02  1.35e8 3.63e9 2018-11-07 08:07:40 <S3: ps_h…
##  7 14905 14899 Slack H… gaborcsa… running  9.21  0.865 2.95e8 3.83e9 2018-11-07 08:07:40 <S3: ps_h…
##  8 14904 14899 Slack H… gaborcsa… running  8.11  0.761 3.01e8 3.83e9 2018-11-07 08:07:40 <S3: ps_h…
##  9 14903 14899 Slack H… gaborcsa… running 19.5   2.88  3.93e8 3.98e9 2018-11-07 08:07:40 <S3: ps_h…
## 10 14900 14899 Slack H… gaborcsa… running 12.8  13.3   3.45e8 3.38e9 2018-11-07 08:07:39 <S3: ps_h…
## # ... with 11 more rows

Top 3 memory consuming processes:

ps() %>%
  top_n(3, rss) %>%
  arrange(desc(rss))
## # A tibble: 3 x 11
##     pid  ppid name     username  status    user system    rss    vms created             ps_handle 
##   <int> <int> <chr>    <chr>     <chr>    <dbl>  <dbl>  <dbl>  <dbl> <dttm>              <I(list)> 
## 1 64244     1 iTerm2   gaborcsa… running 2.93e4 1.90e3 6.71e8 5.31e9 2018-10-30 08:48:09 <S3: ps_h…
## 2 14908 14899 Slack H… gaborcsa… running 4.71e1 5.88e0 5.38e8 4.18e9 2018-11-07 08:07:41 <S3: ps_h…
## 3 67519  9730 Google … gaborcsa… running 6.83e2 8.54e1 5.19e8 3.91e9 2018-11-06 00:23:50 <S3: ps_h…

Top 3 processes which consumed the most CPU time:

ps() %>%
  mutate(cpu_time = user + system) %>%
  top_n(3, cpu_time) %>%
  arrange(desc(cpu_time)) %>%
  select(pid, name, cpu_time)
## # A tibble: 3 x 3
##     pid name                cpu_time
##   <int> <chr>                  <dbl>
## 1   516 com.docker.hyperkit  130929.
## 2 64244 iTerm2                31230.
## 3  9730 Google Chrome         22743.

Contributions

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

License

BSD © RStudio