Learning multigrain diffraction with a synthetic dataset

Multigrain diffraction data processing is tricky: things can go seriously wrong if you do not understand the various parameters and steps in the data processing workflow.

A good strategy is to generate a synthetic dataset, as close as possible to the dataset you will have in your experiment. Generate a dataset from 500 grains if you expect 500 grains. Then, go through the data processing and see whether you can index the grains that were generated.

Workflow outline

Here is a simple outline of what you should do:

  1. generate a synthetic dataset: create an input file (.inp extension) for PolyXSim and run the simulation. The input file should contain information on the experimental setup, the number of grains, the sample shape, strain, background, peak shape, and so on. The simulation may take a while, especially if you start playing with things such as peak broadening…
  2. have a look at the generated diffraction images with Fabian
  3. adjust your experimental parameters and evaluate g-vectors in ImageD11
  4. index your list of extracted g-vectors, using GrainSpotter, for instance.
  5. compare the indexed grains with those generated at the beginning of the procedure.

Details

After following this section, you should understand the parameters involved in multigrain diffraction data processing. You should train with virtual data, assigning rotation ranges, steps, experimental geometry, and crystal structures similar to those of your experiments. At the end, you should be able to re-index 80 to 90% of the starting grains, with very few erroneous indexings.

Producing data by simulation

This step simulates the outcome of the experiment, with a given number of grains, distribution of orientations, strains, etc.

The simulation will not only provide simulated 2D diffraction images but also a list of grains and their G-vectors in various formats. The number of diffraction images depends on the ω range and the step size you choose. For example, an ω range from -28° to +28° with a step size of 0.5° produces 112 images, numbered from 0 to 111.
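
As a quick sanity check, the image count can be computed from the ω range and step. A minimal sketch in Python:

```python
def number_of_images(omega_start, omega_end, omega_step):
    """Number of diffraction images for an omega scan:
    one image per omega step across the full rotation range."""
    return round((omega_end - omega_start) / omega_step)

# The example from the text: -28 deg to +28 deg in 0.5 deg steps
n = number_of_images(-28.0, 28.0, 0.5)
print(n)   # 112 images, numbered 0 to 111
```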

Typically, at the end of the simulation, we work with

  • the generated diffraction images, which we will try to process,
  • the generated list of grains (in gff format), which we will compare to our indexing.

Evaluating the simulated diffraction images

You should look at the simulated diffraction images with Fabian. The goal of this step is to

  • locate diffraction peaks,
  • evaluate their intensity and that of the surrounding background,
  • understand the concept of the O-Matrix.

Fabian allows you to zoom-in on a peak as well as browse through diffraction images as you increase or decrease ω.

You will be able to evaluate potential issues with peak overlap. How far can you rotate in ω before you run into another peak? What is the η range within which you can safely assign this peak and not its neighbor?

Working on background

Typically, with simulated data, background will not strongly affect your peak search. This step, however, is a good opportunity to practice calculating median and average images.

The average image is a representation of the data that includes

  • the background,
  • the diffraction from the surrounding matrix or the powder portion of the sample, which gives continuous diffraction rings in the image,
  • the diffraction from the sample grains, which gives rise to well-defined diffraction spots.

In Fabian, you can subtract the median image from the original data. You will see the spots from the sample grains only. In theory, all background and contribution from the surrounding matrix should be removed.

Create an input file with the extension .inp. To start, simply modify an existing one. Afterwards, run the simulation with PolyXSim by typing the following in the Konsole:

PolyXSim.py -i 'some_input_file'.inp

The various output files (grain list, G-vectors, etc.) are usually created quite fast. The time-consuming process is the creation of images. This time strongly depends on the parameters in the input file, e.g. the number of grains, the peak shape, and whether you switched on strain tensors or noise. If you just want to test that the software is working, it is wise to use an input file with very simple parameters (only 1 grain, no strain tensors, no noise, a small ω range, etc.).
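
If you have no input file to start from, a minimal test file might look like the sketch below. The keyword names follow common PolyXSim examples but are given here from memory; check them against the PolyXSim documentation before use:

```
direc 'sim'            # output directory
stem 'test'            # file name stem for all output files
no_grains 1            # a single grain for a quick test
gen_U 1                # generate a random grain orientation
gen_pos 0 0            # grain at the sample center, no spread
omega_start -28.0      # small omega range for a fast test
omega_end 28.0
omega_step 0.5
noise 0                # no counting noise
make_image 1           # actually write the .edf images
```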

While the simulation is running, you can already look at the images created so far. To do this, open a new tab in the Konsole and start Fabian:

fabian.py

This is convenient because you can see at this point whether your simulation works. If it does not, you can stop the simulation right away instead of waiting until all images are created, which can take a very long time. While you're at it, also check the O-matrix. You will find it in Fabian under Image –> Orientation. Choose the one that matches your input file.

Background subtraction

To get rid of the background, we now add up all the diffraction images and calculate an average and a median image. This average/median image is then subtracted from every image, which should remove the background. Of course, if you switched off the background in the simulation, this process won't change anything. But with real data, this procedure is essential!
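
The idea can be sketched with NumPy; the array sizes and intensity values here are purely illustrative, not tied to any real file:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stack of 112 small "diffraction images":
# a constant background plus one bright, image-specific spot each.
images = np.full((112, 64, 64), 100.0)
for i in range(112):
    y, x = rng.integers(0, 64, size=2)
    images[i, y, x] += 5000.0        # a diffraction spot

median_image = np.median(images, axis=0)   # the "m2" image
average_image = np.mean(images, axis=0)

# Subtract the median image from every frame: the constant
# background disappears and only the spots survive.
corrected = images - median_image
```

Because each spot appears in only one frame, the per-pixel median is just the background, so the subtraction leaves clean spots on a zero background.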

To calculate the average and median, use the Python script median.py. An alternative is Image Math. For median.py, type the following at the command line:

median.py -h

The help will pop up and tell you how to use it.

The calculation will create an additional .edf file.

Next, one of these images has to be subtracted from the actual images. Usually the m2 (median) image is used, because it is less affected by outliers. Before you do this, make sure you have a separate folder, to avoid mixing up the raw data with the processed data. Raw data should never be modified!

Look at the images in Fabian: go to Image –> Correction –> Subtract background and choose the m2 image. Now the m2 (median) image is subtracted from every image that is currently loaded (including those you can access by clicking on next and previous). Unless you are working with noise-free simulated data, you should see a difference: the peaks should appear clearer.

Peak extraction

From these processed images you can now extract the peaks. Look at some random peaks from several images by zooming in (in Fabian) and check their intensity. Try to define a threshold value that separates peaks from background: everything above the threshold is considered a peak, everything below is background. If you are not sure, you can also define several threshold values.
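
The idea behind the threshold can be illustrated in a few lines of Python. This is only a toy stand-in for what peaksearch.py does; the image, intensities, and function name are made up:

```python
import numpy as np
from scipy import ndimage

# Toy image: flat background with two "peaks" of different intensity
image = np.full((32, 32), 10.0)
image[5:8, 5:8] = 500.0      # a strong peak
image[20:22, 20:22] = 60.0   # a weak peak

def find_peaks(img, threshold):
    """Label connected pixel groups above the threshold and
    return the center of mass of each group."""
    mask = img > threshold
    labels, n = ndimage.label(mask)
    return ndimage.center_of_mass(img, labels, list(range(1, n + 1)))

print(len(find_peaks(image, 50)))    # 2: both peaks found
print(len(find_peaks(image, 100)))   # 1: only the strong peak survives
```

A threshold that is too low merges background fluctuations into peaks; one that is too high loses weak peaks, which is why trying several values is useful.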

Once you have defined one (or more) threshold(s), you can start the PeakSearch algorithm:

peaksearch.py -n ../'directory'/'name stem' -f 'first image number' -l 'last image number' -d ../'directory'/'median.edf file' -t 'threshold value 1' -t 'threshold value 2' ...

To check the outcome of PeakSearch, you can load the peaks that were found into Fabian and see whether they match the actual peak positions. To do this, click on CrystTools –> Peaks –> Read peaks and choose the .flt file that PeakSearch just created. The peaks should appear as red circles on the diffraction image. You can switch the markers on and off by clicking on CrystTools –> Peaks –> Show.

Experimental parameters

From these peaks you can now fit the experimental parameters. To do this, open ImageD11 by typing the following in the Konsole:

ImageD11_gui.py

To load the PeakSearch file, click on Transformation –> Load filtered peaks and choose the .flt file from the separate folder with the processed data. Although the file is loaded, it is not plotted automatically, because there are two different ways of plotting. One option is the 2D diffraction image, which is similar to Fabian (y/z plot). The other is a cake plot (tth/eta plot). Both options can be accessed by clicking on Transformation. Note that plotting both at once does not make sense, because the software uses the same scale for both (which makes the plot look weird). To switch from one plot to the other, click on the Clear button (at the bottom of the window) and then plot the other one. Clear only erases the plot; the data is still there.
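
The relation between the two plots can be sketched as follows, assuming the standard transmission geometry (flat detector perpendicular to the beam). The beam-center and pixel-size values below are arbitrary examples, and ImageD11's own sign conventions for η may differ:

```python
import numpy as np

def yz_to_tth_eta(y, z, yc, zc, pixel_size, distance):
    """Convert detector pixel coordinates (y, z) to scattering
    angle 2-theta and azimuth eta, for a flat detector normal to
    the beam. pixel_size and distance share the same length unit."""
    dy = (y - yc) * pixel_size
    dz = (z - zc) * pixel_size
    r = np.hypot(dy, dz)                     # radial distance from beam center
    tth = np.degrees(np.arctan2(r, distance))
    eta = np.degrees(np.arctan2(dy, dz))     # azimuth around the ring
    return tth, eta

# Arbitrary example: beam center at pixel (1024, 1024),
# 50 micron pixels, detector 200 mm from the sample.
tth, eta = yz_to_tth_eta(1224.0, 1024.0, 1024.0, 1024.0, 0.05, 200.0)
```

In the cake plot, a peak at (tth, eta) from every frame is drawn at those coordinates, which is why unstrained grains line up on vertical lines of constant 2θ.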

Before you check the plots, you should enter the measurement parameters. Go to Transformation –> Edit parameters and enter all parameters for your sample. Some of them can be found in the calibration files of the beamline (such as the poni file). Remove all check marks from the vary boxes and press Ok.

Next, have a look at the tth/eta plot. Most of the peaks should appear to lie on imaginary vertical lines. Zoom in and check whether these lines are perfectly vertical. If not, you might have strain in your sample. If a line looks like a sine curve of exactly one period, this is due to a wrong beam center. To fix this, go back to Edit parameters and activate the check marks for the y-position and z-position of the detector. Press Ok and click on Fit several times, until the spots no longer move. The imaginary lines should now be perfectly straight (if you don't have strain). If they are not, you can try to fit other parameters.

At some point you can click on Transformation –> Add unit cell peaks. Red tick marks will appear, indicating the expected positions of the vertical lines. With these you can check whether your input parameters (cell parameters, detector distance, …) were correct.

Grain indexing

This step is necessary to get the G-vectors from your grains.

In ImageD11, click on Indexing –> Assign peaks to powder rings (nothing visible will happen), then click on Transformation –> Compute g-vectors, and finally Transformation –> Save g-vectors. Make sure the file gets the extension .gve.

To index the grains you need GrainSpotter and an .ini file. If you previously ran a simulation with PolyXSim, you already have an .ini file that you can modify for your purposes. Make sure to keep the original and only modify a copy. For details on what this .ini file should contain, check the GrainSpotter wiki page, the .ini wiki page, or the GrainSpotter manual. Make sure the .ini file points to the right .gve file (the one you just created).

To start GrainSpotter, type one of the following in the Konsole:

GrainSpotter.0.90 'some_file_name'.ini
or
grainspotter 'some_file_name'.ini

For more information on which syntax you should use, check the GrainSpotter wiki page.

The outcome of GrainSpotter is three files: a .gff file, a .ubi file, and a .log file. These files contain information on the number of grains found, their UBi matrices, and more. If you are already working with real data, you can now interpret what you got.

Check your workflow

If you ran a simulation beforehand, this is the time to check whether you (and the software) did a good job. Open the .ubi file just created by GrainSpotter and compare the UBi matrices with the ones created by the simulation at the very beginning. The UBi matrices can appear in a different order but should be the same. Remember that some rows within a matrix can be swapped or have their signs flipped due to symmetry.

Example: the following two matrices were created by PolyXSim (left) and by GrainSpotter (right). The symmetry of the material is tetragonal, which means the a-axis and b-axis are equivalent and cannot be distinguished by the software, so rows 1 and 2 are exchangeable. In addition, their signs are opposite, but the software cannot distinguish the polarity of the grain either. Based on this, the two UBi matrices are identical.

From PolyXSim             From GrainSpotter
 3.582  2.186 -0.098       4.411  0.360  1.912
-4.411 -0.360 -1.912      -3.582 -2.186  0.098
 0.133  1.995  0.247       0.133  1.995  0.247
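
This comparison can be automated. The sketch below (using the two matrices from the example) checks whether one UBi matrix can be turned into the other by reordering and sign-flipping rows. A rigorous check would apply the actual symmetry operators of the crystal's point group, but a brute-force search over signed row permutations is enough for a quick test:

```python
import numpy as np
from itertools import permutations, product

# The two matrices from the example above
ubi_sim = np.array([[ 3.582,  2.186, -0.098],
                    [-4.411, -0.360, -1.912],
                    [ 0.133,  1.995,  0.247]])

ubi_found = np.array([[ 4.411,  0.360,  1.912],
                      [-3.582, -2.186,  0.098],
                      [ 0.133,  1.995,  0.247]])

def same_grain(a, b, tol=1e-3):
    """True if b equals a up to a permutation and sign flip of its
    rows (a crude stand-in for a true point-group symmetry check)."""
    for perm in permutations(range(3)):
        for signs in product([1, -1], repeat=3):
            candidate = np.array([s * a[p] for s, p in zip(signs, perm)])
            if np.allclose(candidate, b, atol=tol):
                return True
    return False

print(same_grain(ubi_sim, ubi_found))   # True: same grain
```

Here rows 1 and 2 are swapped and negated, exactly the tetragonal ambiguity described in the text, so the two matrices describe the same grain.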

If all the simulated UBi matrices match the calculated ones, you can be quite sure that your workflow is running properly. As a next step, you can work with real data.

processing/workflow_training.1550574953.txt.gz · Last modified: 2019/02/19 11:15 by smerkel