Searching large source trees in an efficient way on Linux


Here is the alias (csh/tcsh syntax):

alias search 'find \!:1 -noleaf -type f -not -path "*/boost/*" -not -path "*/extensions/*" -print0 | xargs -0 -n 100 -P 8 grep -I --color -H -n \!:2*'


How do I use it?

The general form, followed by two examples:

search [dir] [term] [grep_options]
search ./src/ the\ search\ term
search ./src/ keyTerm -A5 -B5

How does it work?


This search alias uses find as follows to locate all files under the provided directory (i.e. first argument) while excluding directories that we don’t care about:

find \!:1 -noleaf -type f -not -path "*/boost/*" -not -path "*/extensions/*" -print0

For csh aliases, remember these history references to the arguments:

!* is all of the arguments (everything but the command itself)
!:0 is only the command itself
!:1 is only the first argument
!:2* is all but the first argument
!$ is only the last argument
!:1- is all but the last argument
!! is the entire command line

These Bourne-style shell (sh/bash) variables are also handy:

$0 is the shell
$# is the number of args
$$ is the process id (PID) of the shell
$! is the PID of the most recent background command
$? is the return code from the previous command

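Thus, the “\!:1” means only the first argument. The bang (!) has to be escaped with a backslash so that it is stored in the alias and expanded when the alias is used, not when it is defined.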


The “-noleaf” option is used because I normally work on Windows/NTFS mounts. By default, find optimizes by assuming that a directory contains 2 fewer subdirectories than its hard link count, and that assumption is not safe on filesystems that do not follow the Unix directory-link convention.


We only want to gather files for searching so I use the “-type f”.

-type f

I normally have very large directories which I do not care to search in, so I specify:

-not -path "*/boost/*" -not -path "*/extensions/*"

Finally, I pass “-print0” so that find terminates each path with a null character instead of a newline. This adds support for paths with spaces in them:

-print0

The xargs command controls how many files are passed to each grep invocation and runs those invocations in parallel.

xargs -0 -n 100 -P 8 grep -I --color -H -n \!:2*

The “-0” option tells xargs that the incoming strings are null terminated (this adds support for files with spaces):

-0

The “-n 100” and the “-P 8” options are where the speed and power of this alias come from. The “-n 100” is telling xargs to pass 100 files from find into grep at a time. The “-P 8” is telling xargs to run 8 grep commands in parallel.

This means that if we have a source tree of 1600 files, then grep will be called 16 times and each call will be passed 100 files. The best part is that 8 of those grep commands will be running in parallel, each on 100 files, so the command finishes in roughly the time of two (2) back-to-back grep invocations (assuming enough idle cores), which is very fast even on large source trees:

-n 100 -P 8
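The batch arithmetic is easy to check from the shell. The following is a minimal sketch, assuming GNU seq and xargs are available. The name count_batches is a hypothetical helper, and echo stands in for grep so that each invocation prints exactly one line:

```shell
# count_batches TOTAL PER_CALL
# Feed TOTAL dummy items to xargs, PER_CALL at a time. echo prints one
# line per invocation, so counting lines counts invocations.
count_batches() {
  seq 1 "$1" | xargs -n "$2" echo | wc -l
}

count_batches 1600 100   # prints 16
```

With “-P 8” added, those 16 invocations would be spread over up to 8 concurrent processes.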


The grep command is used to do the actual searching in files.

grep -I --color -H -n \!:2*

The “-I” option ignores binary files:

-I

Colored results make it much easier to see hits:

--color

When grep is given only a single file (which can happen with the last batch from xargs), it does not print the file name where the hit occurred, so we add “-H” to always print the file name:

-H

The line number is also important, so we add “-n”:

-n

The ability to control grep is handled with an arguments wildcard. Here the “\!:2*” means the second and all subsequent arguments passed into the search alias. Thus the grep search term and all other grep options can be specified after the directory to search:

\!:2*

The final piece is that the xargs command appends the files from the find command to the grep command. It adds 100 files (or fewer, if fewer than 100 remain) to every grep command, and up to 8 of those commands run in parallel at any given time.
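The alias above uses csh/tcsh syntax. For bash, a rough equivalent can be sketched as a shell function. This function is not part of the original setup; it is an assumption that mirrors the alias, with "$1" standing in for \!:1 and "$@" (after the shift) standing in for \!:2*. It assumes GNU find, xargs and grep:

```shell
# Hypothetical bash counterpart of the csh "search" alias.
# Usage: search DIR TERM [GREP_OPTIONS...]
search() {
  local dir=$1
  shift   # drop DIR; "$@" now holds the term and any grep options
  find "$dir" -noleaf -type f \
    -not -path "*/boost/*" -not -path "*/extensions/*" -print0 |
    xargs -0 -n 100 -P 8 grep -I --color -H -n "$@"
}
```

Because "$@" keeps each argument intact, a multi-word term can be quoted normally, e.g. search ./src/ "the search term" -A5 -B5.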

Enjoy your searching.


Catching a double free or corruption error with memcheck (a Valgrind tool)

I was randomly getting errors (1 run in 50 would reproduce) like:

$ ./myprogram
*** glibc detected *** double free or corruption (out): 0x093014a4 ***

Linux randomizes the virtual address space layout (ASLR), which is supposed to help thwart buffer overflow attacks and the like.

This randomization can cause errors to show up only some of the time, so in the spirit of trying to reproduce the problem consistently I disabled it for this run using:

$ setarch x86_64 -R ./myprogram

This didn’t seem to help, so I ran the program under Valgrind’s memcheck tool:

$ valgrind --tool=memcheck ./myprogram

The output reported an “Invalid free”, which showed where the error was.
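Because the failure only reproduced in roughly 1 run out of 50, a small retry loop can help when hunting for it or when checking that a fix holds. This is a minimal sketch; run_until_fail is a hypothetical helper name, not something from the original workflow:

```shell
# run_until_fail MAX COMMAND [ARGS...]
# Run COMMAND up to MAX times and stop at the first non-zero exit code.
run_until_fail() {
  local max=$1 i=1
  shift
  while [ "$i" -le "$max" ]; do
    if ! "$@"; then
      echo "failed on attempt $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure in $max attempts"
  return 0
}
```

For example, run_until_fail 200 ./myprogram keeps going until the crash shows up. To loop under memcheck instead, Valgrind’s --error-exitcode=1 option makes reported errors visible as a failing exit code.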

How to generate a ctags file and use it with Vim

The ctags command allows you to index source, any source. The command generates a single file called a tags file (fittingly, the file itself is named “tags”). You can then point editors like VIM/GVIM/EMACS at this file for auto-complete and for cross-probing.

Let’s see it in action…

First tag the source you want to index using:

ctags -V -R --c++-kinds=+p --fields=+iaS --extra=+q --languages=c++ .

This will recursively index all files from this point down and generate a ‘tags’ file in the current directory. You should open the tags file and look around to see if it’s getting what you expect and that it isn’t getting things that you don’t expect.
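One quick way to spot-check the tags file from the shell is to list where a given name is defined. This is a minimal sketch, assuming the usual tab-separated tags layout (tag name, then file, then search command); tag_files is a hypothetical helper name:

```shell
# tag_files NAME [TAGS_FILE]
# Print the source file of every tags entry whose first field is NAME.
# TAGS_FILE defaults to ./tags.
tag_files() {
  awk -F'\t' -v name="$1" '$1 == name { print $2 }' "${2:-tags}"
}
```

Running tag_files MyClass in the directory that holds the tags file should list the files that define MyClass; an empty result suggests the source was not indexed as expected.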

From here we can open vim and start auto-completing. This works because VIM will automatically search the current directory for a tags file, which we just generated. If you want to point to a tags file that lives in another directory, then use:

:set tags=/path/to/file/with/tags

Now you can start typing. Suppose we have a class called MyClass. In VIM, enter insert mode (press ‘i’), type the first few letters of the name (for example “My”), and press CTRL+N. A drop-down menu should appear with the option to auto-complete to MyClass (presuming some MyClass.cpp was indexed with the ctags command).

CTRL+N : to go forward through tags

CTRL+P : to go backward through tags

Now the cool part. Put your cursor on any identifier (for example, one you just auto-completed) and press CTRL+]. VIM will open the source file that defines that identifier. If several definitions match, pressing g] instead shows a menu that lets you pick the implementation you would like to go to.

CTRL+] : to push into tags

CTRL+T : to pop back out of tags


How can I run something in one thread and wait for the result in a different thread using Java/JavaFX?


The code is as simple as:

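import java.util.concurrent.CountDownLatch;

final CountDownLatch latchToWaitForJavaFx = new CountDownLatch(1);

Platform.runLater(() -> {
  // Do the required work on the JavaFX thread, then release the waiting thread:
  latchToWaitForJavaFx.countDown();
});

// Block the calling thread until the JavaFX thread has counted down:
latchToWaitForJavaFx.await();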

Hopefully you would rarely need to synchronize two threads. This is normally an indicator of a poor design – ideally each thread would be able to operate independently. However, I have found a common need to synchronize between threads when testing JavaFX GUIs using JUnit.

For instance, you may want to invoke some GUI behavior, and then check if the GUI was updated. For these cases, you will need to block the testing thread until the GUI action has completed. This can be done using a CountDownLatch as follows:

import java.util.concurrent.CountDownLatch;
import javafx.application.Platform;
import javafx.scene.Scene;
import javafx.stage.Stage;

public void showAnchorPane_shouldFocusTextField() throws Exception {
  // Note: We are in the testing thread.

  // Here we create the count down latch which is thread safe:
  final CountDownLatch latchToWaitForJavaFx = new CountDownLatch(1);

  // We now create a runnable and register that runnable to be called
  // by the JavaFX thread using the Platform.runLater method. This call
  // is asynchronous and returns immediately:
  Platform.runLater(new Runnable() {
    @Override
    public void run() {
      // Do required work on JavaFX thread...
      Stage stage = new Stage();
      Scene scene = new Scene(anchorPane);
      stage.setScene(scene);
      stage.show();

      // If we threw before this line, while in the JavaFX thread, then the
      // calling thread would never know!

      // Now the work is done we release the testing thread:
      latchToWaitForJavaFx.countDown();
    }
  });

  // Because the last call returned immediately, we wait here until the
  // JavaFX thread has completed its work for this test:
  latchToWaitForJavaFx.await();

  // Now check if GUI is open and that the right object has focus, etc.
}

Let’s refactor to a function:

private void runAndWaitOnJavaFx(Runnable guiWorker) throws Throwable {
  final CountDownLatch latchToWaitForJavaFx = new CountDownLatch(1);
  Platform.runLater(() -> {
    guiWorker.run();
    // If we threw before this line, while in the JavaFX thread, then the
    // calling thread never would know!
    latchToWaitForJavaFx.countDown();
  });
  latchToWaitForJavaFx.await();
}

To deal with the case where the runnable throws and the calling thread would otherwise never know, capture the exception and rethrow it on the calling thread:

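private void runAndWaitOnJavaFx(Runnable guiWorker) throws Throwable {
  final CountDownLatch latchToWaitForJavaFx = new CountDownLatch(1);
  final Throwable[] javaFxException = {null};
  Platform.runLater(() -> {
    try {
      guiWorker.run();
    } catch (Throwable e) {
      // Remember the failure so the calling thread can rethrow it:
      javaFxException[0] = e;
    } finally {
      // Always release the calling thread, even if the worker threw:
      latchToWaitForJavaFx.countDown();
    }
  });
  latchToWaitForJavaFx.await();
  if (javaFxException[0] != null) {
    throw javaFxException[0];
  }
}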

How can I easily access my Linux command history?

One of the fastest ways to search your previous commands is to press CTRL+R and start typing. Once you’ve entered enough text, you can press CTRL+R again and again to step back through the matches in your history.

Let’s assume we execute the following commands:

$ echo dog
$ echo cat
$ echo hotdog

Now press CTRL+R and you will see a new “bck:” prompt at the bottom (the exact prompt text depends on your shell; bash, for example, shows “(reverse-i-search)”).


Now if you type “dog” you will see the last command that had that string anywhere in it, populated on the previous command prompt:

$ echo hotdog

Pressing CTRL+R again will cycle through the history:

$ echo dog

You can press ENTER to execute the command, or CTRL+E to go to the end of the command without executing it.

I also hear you can use CTRL+S to search in the opposite direction (forward through the history), but that never works for me. Most likely my terminal is swallowing the CTRL+S, since many terminals reserve that key for flow control (XOFF).
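A minimal sketch of a common workaround, assuming the terminal’s XON/XOFF flow control is what intercepts CTRL+S on your system. The name enable_forward_search is a hypothetical helper that could live in a shell startup file:

```shell
# Turn off XON/XOFF flow control so CTRL+S can reach the shell.
# Only touch terminal settings when stdin really is a terminal.
enable_forward_search() {
  if [ -t 0 ]; then
    stty -ixon
  fi
  return 0
}

enable_forward_search
```

If CTRL+S still does nothing afterwards, the key is probably being captured elsewhere (for example, by the window manager).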


How do you SSH without a password and what permissions are needed on the .ssh files?

SSH stands for Secure Shell, a protocol that provides secure network services over unsecured networks. Most users think of SSH as a way to remotely connect to and control another machine. Every time you establish a new connection you need to authenticate. This is generally done with a password, but a key can be used instead. This article covers using a key so you don’t have to enter a password, and it lists the default .ssh permissions, which are the permissions you should use on these files for security reasons.


Home Directory Group

I have encountered problems after switching groups within my company: the machines I connected to expected a different group on my home directory. Because my home directory did not have the correct group, the SSH daemons on those machines, which were owned by the new group, were unable to read my ~/.ssh/ files.

You can test this by changing your home directory permissions to 755. Note: Most IT departments will want your home directory locked down (700), so this might not be something you want to do.

If you know someone else who is able to SSH without a password, then see what group is set on their home. You can change this with chown, e.g.

$ ls -ld ~
drwx------ 109 me old 196608 Jan 24 17:38 .
$ chown me:new ~
$ ls -ld ~
drwx------ 109 me new 196608 Jan 24 17:38 .

Preserve Old .ssh Folder

If you don’t have a ~/.ssh folder, then you don’t need to worry about this step.

Preserve your old .ssh folder by moving it to a new name. This prevents us from losing any important keys that we might find out later we were using.

mv ~/.ssh{,.old}

This will move ~/.ssh to ~/.ssh.old.

Create the Key

Now let’s create our key. The SSH tools, provided on most Linux machines, include a command to generate keys: ssh-keygen

You can tell this command what type of key to generate with -t, e.g. ssh-keygen -t rsa

For our use case, we want an RSA key, which is the default here, so I won’t specify a type. We also don’t want a passphrase, but if you did you could enter one.

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/me/.ssh/id_rsa): [ENTER]
Created directory '/home/me/.ssh'.
Enter passphrase (empty for no passphrase): [ENTER]
Enter same passphrase again: [ENTER]
Your identification has been saved in /home/me/.ssh/id_rsa.
Your public key has been saved in /home/me/.ssh/
The key fingerprint is:
9a:f0:d2:de:dd:1a:69:b1:ee:c1:a3:f1:4c:7f:07:51 me@myserver
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                E|
|               . |
|              .  |
|   . S  .      . |
|  +  o  .  +   . |
|       . = . X . |
| o  .  .  @  =  o|
|      . . o.B ...|

Let’s see what was created and also take note of the current permissions:

$ ls -la ~/.ssh
total 208
drwx------   2 me grp   4096 Jan 24 10:31 .
drwxr-xr-x 110 me grp 196608 Jan 24 10:31 ..
-rw-------   1 me grp   1675 Jan 24 10:31 id_rsa
-rw-r-----   1 me grp    401 Jan 24 10:31

This looks like what we’d expect. The ~/.ssh/id_rsa (private key) is only readable by me, and the ~/.ssh/ (public key) is readable by my group.

Share Public Key

In order to SSH into a machine, that machine must grant your key access. This is accomplished by adding your public key to the ~/.ssh/authorized_keys file.

Notice: The ~/.ssh/authorized_keys file I’m referring to here is on the machine you want to connect to – not the machine you are currently on. This could even be under a different user.

This can even work for root at /root/.ssh/authorized_keys.

Regardless of the target machine configuration, if it has SSH installed and a daemon running (which most will), then you can send your key to this machine with the ssh-copy-id command:

$ ssh-copy-id mymachine
The authenticity of host 'mymachine(' can't be established.
RSA key fingerprint is 17:62:db:8d:c6:84:fa:6a:84:8d:c6:19:31:75:1a:fd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'mymachine,' (RSA) to the list of known hosts.
me@mymachine's password:
Now try logging into the machine, with "ssh 'mymachine'", and check in:


to make sure we haven't added extra keys that you weren't expecting.

Let’s look at ~/.ssh again and take note of the permissions:

$ ls -la ~/.ssh
total 208
drwx------   2 me grp   4096 Jan 24 10:31 .
drwxr-xr-x 110 me grp 196608 Jan 24 10:31 ..
-rw-------   1 me grp    432 Jan 24 11:06 authorized_keys
-rw-------   1 me grp   1675 Jan 24 10:31 id_rsa
-rw-r-----   1 me grp    401 Jan 24 10:31
-rw-r--r--   1 me grp    408 Jan 24 11:02 known_hosts

Notice that we now have two more files. The authorized_keys file is the one we discussed earlier. The known_hosts file is also new, and we will talk more about it later.
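If the permissions on these files have drifted, they can be put back to the values shown in the listing. This is a minimal sketch under that assumption; fix_ssh_permissions is a hypothetical helper name, and it only touches files that exist:

```shell
# fix_ssh_permissions [DIR]
# Apply the permissions from the listing above. DIR defaults to ~/.ssh;
# pass a scratch directory to try it out safely.
fix_ssh_permissions() {
  local dir=${1:-$HOME/.ssh} f
  chmod 700 "$dir"
  # Private key and authorized_keys: owner read/write only.
  for f in id_rsa authorized_keys; do
    if [ -e "$dir/$f" ]; then
      chmod 600 "$dir/$f"
    fi
  done
  # known_hosts holds no secrets, so it stays world-readable.
  if [ -e "$dir/known_hosts" ]; then
    chmod 644 "$dir/known_hosts"
  fi
  return 0
}
```

Call it with no arguments to act on ~/.ssh, then confirm the result with ls -la ~/.ssh.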

If your home directory is the same on all machines

This becomes really nice if you connect to several machines which all share the same home directory. You can put your own public key into your own authorized_keys file, and then every machine you connect to on the network will let you in without a password.

If you have the same home drive on the machine(s) you want to connect to, then instead of using ssh-copy-id, you can just add the public key to the authorized_keys file:

cat ~/.ssh/ >> ~/.ssh/authorized_keys

The known_hosts file

The known_hosts file keeps a list of fingerprints of the machines you connect to. The first time you connect to a machine you will see this:

The authenticity of host 'mymachine(' can't be established.
RSA key fingerprint is 17:62:db:8d:c6:84:fa:6a:84:8d:c6:19:31:75:1a:fd.
Are you sure you want to continue connecting (yes/no)?

If you type yes, then you will not see this message on later connections. This mechanism protects you in case an attacker adds a machine with this host name to the network: you could connect to the correct host name, but get the wrong machine.

If this happens then you will see a message like this:

$ ssh mymachine
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
Please contact your system administrator.
Add correct host key in /home/me/.ssh/known_hosts to get rid of this message.
Offending key in /home/me/.ssh/known_hosts:1
RSA host key for mymachine has changed and you have requested strict checking.
Host key verification failed.

Have fun sshing!

How do you chain commands in BASH or CSH?

I often need to run several commands which can take an hour or more and I won’t necessarily be present the entire time. It is nice to be able to run a command and, if it is successful, run another. However, if one command fails, the flow should stop. This article shows how to do this on the command line.

Chaining On Success

Perhaps you want to build, and then test. It would be annoying if the tests ran even if the build failed.

To chain the commands so subsequent commands will only run if the preceding command was successful (returned a zero):

make && run_test && echo "SUCCESS"

Chaining On Failure

To run a second command only if the first command failed, use:

make || echo "BUILD FAILED!"

You can also combine them:

(sh -c "exit 0" && sh -c "exit 1") && echo "SUCCESS" || echo "FAIL"

With Email

This becomes very powerful if you send yourself an email with the results so you know when a run completes and the result, e.g.

(sh -c "exit 0" && sh -c "exit 1") && mailx -s "Build: SUCCESS" $USER < /dev/null || mailx -s "Build: FAIL" $USER < /dev/null
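One caveat with the A && B || C form: C also runs when A succeeds but B fails (for example, if the success mail cannot be sent). A minimal sketch using if/else avoids that. Here report is a hypothetical helper name and echo stands in for the mailx calls:

```shell
# report COMMAND [ARGS...]
# Run COMMAND and announce the outcome. The failure branch runs only when
# COMMAND itself fails, never because the success branch failed.
report() {
  if "$@"; then
    echo "SUCCESS"
  else
    echo "FAIL"
  fi
}

report sh -c "exit 0"   # prints SUCCESS
report sh -c "exit 1"   # prints FAIL
```

A whole chain can be passed as one command, e.g. report sh -c 'make && run_test'.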

Chaining Regardless

To run a second command regardless of the first command’s return code, use:

make; echo "BUILD DONE - make returned code: $?"

How does this work?

It is easiest to think of this all simply as Boolean logic. Boolean comparisons will only evaluate until the result can be determined.

On Linux a return code from a command of “0” means success. All other return codes are regarded as failures. This can be confusing because normal Boolean logic uses a 1 for true, e.g.

1 && 1 == 1, Evaluates the first and second expression
1 && 0 == 0, Evaluates the first and second expression
0 && 1 == 0, Only evaluates the first expression
0 && 0 == 0, Only evaluates the first expression

However, for our return codes this looks like:

exit 0 && exit 0 == Success, Executes the first and second command
exit 0 && exit 1 == Fail, Executes the first and second command
exit 1 && exit 0 == Fail, Only executes the first command
exit 1 && exit 1 == Fail, Only executes the first command

We can use a similar example for OR:

1 || 1 == 1, Only evaluates the first expression
1 || 0 == 1, Only evaluates the first expression
0 || 1 == 1, Evaluates the first and second expression
0 || 0 == 0, Evaluates the first and second expression

With return codes this looks like:

exit 0 || exit 0 == Success, Only executes the first command
exit 0 || exit 1 == Success, Only executes the first command
exit 1 || exit 0 == Success, Executes the first and second command
exit 1 || exit 1 == Fail, Executes the first and second command
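The tables above can be checked directly in the shell. This is a minimal sketch; ok, fail and show are hypothetical helper names, with ok and fail standing in for “exit 0” and “exit 1” so that the interactive shell is not closed:

```shell
# Helpers that announce themselves so we can see which commands ran.
ok()   { printf '[ok]';   return 0; }
fail() { printf '[fail]'; return 1; }

# show 'COMMANDS': run the chain, then print what ran and the exit code.
show() {
  local out status
  if out=$(eval "$1"); then status=0; else status=$?; fi
  echo "$1 => ran: $out, exit code: $status"
}

show 'ok && fail'   # ran: [ok][fail], exit code: 1
show 'fail && ok'   # ran: [fail], exit code: 1
show 'ok || fail'   # ran: [ok], exit code: 0
show 'fail || ok'   # ran: [fail][ok], exit code: 0
```

In each case the second command runs only when the first one has not already decided the result.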