====== Learning multigrain diffraction with a synthetic dataset ======

Multigrain diffraction data processing is tricky: things can go seriously wrong if you do not understand the various parameters and steps in the data processing workflow.

A good strategy is to generate a synthetic dataset that is as close as possible to the dataset you will collect in your experiment. Generate a dataset from 500 grains if you expect 500 grains. Then go through the data processing workflow and check whether you index the grains that were generated.

===== Workflow outline =====

Here is a simple outline of what you should do:
  - [[processing:synthetic_dataset|generate a synthetic dataset]]; this includes creating an input file (//.inp// extension) for [[software:polyxsim|PolyXSim]] and running the simulation. The input file should contain information on the experimental setup, the number of grains, the sample shape, strain, background, peak shape, and so on. The simulation may take a while, especially if you start playing with things such as peak broadening.
  - have a look at the generated diffraction images with [[software:fabian|Fabian]]
  - [[processing:search_for_peaks|search for peaks]] using [[software:peaksearch|PeakSearch]]
  - adjust your experimental parameters and [[processing:compute_gvectors|evaluate g-vectors]] in [[software:imaged11|ImageD11]]
  - index your list of extracted g-vectors, using [[software:grainspotter|GrainSpotter]], for instance.
  - [[processing:compare-grain-orientations|compare the indexed grains]] with those generated at the beginning of the procedure.

===== Details =====

<WRAP center round tip 80%>
After following this section, you should understand the parameters involved in multigrain diffraction data processing. You should train with virtual data, assigning rotation ranges, steps, experimental geometry, and crystal structures similar to those of your experiments. At the end, you should be able to re-index 80 to 90% of the starting grains, with very few erroneous indexings.
</WRAP>

==== Producing data by simulation ====

This step will [[processing:synthetic_dataset|simulate the outcome of the experiment]], with a given number of grains, distribution of orientations, strains, etc.

The simulation will not only provide simulated 2D diffraction images but also a list of grains and their g-vectors in [[processing:synthetic_dataset#output_files|various formats]]. The number of diffraction images depends on the ω range and the step size you choose. For example, an ω range from -28° to +28° with a step size of 0.5° produces 112 images numbered from 0 to 111.
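
The image count in the example above is simple arithmetic. The short Python sketch below reproduces it. The function name ''frame_count'' is hypothetical and is not part of PolyXSim; it only assumes that each image covers one ω step.

```python
def frame_count(omega_start, omega_end, step):
    """Number of images when each image covers one omega step (assumption)."""
    return round((omega_end - omega_start) / step)


# Values taken from the example in the text: -28 to +28 degrees, 0.5 degree steps
n_images = frame_count(-28.0, 28.0, 0.5)
last_index = n_images - 1  # images are numbered from 0

# Omega interval covered by each image, e.g. image 0 covers (-28.0, -27.5)
intervals = [(-28.0 + i * 0.5, -28.0 + (i + 1) * 0.5) for i in range(n_images)]

print(n_images, last_index, intervals[0], intervals[-1])
```

Running the same calculation with your own ω range and step is a quick way to check that the number of files written by the simulation is the one you expect.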

Typically, at the end of the simulation, we work with
  * the generated diffraction images, which we will try to process,
  * the generated list of grains (in //.gff// format), which we will compare to our indexing.

==== Evaluating the simulated diffraction images ====

You should look at the simulated diffraction images with [[software:fabian|Fabian]]. The goal of this step is to
  * locate diffraction peaks,
  * evaluate their intensity and that of the surrounding background,
  * understand the [[dac_experiments:geometry|concept of the O-Matrix]].

[[software:fabian|Fabian]] allows you to zoom-in on a peak as well as browse through diffraction images as you increase or decrease ω.

You will be able to evaluate potential issues with peak overlap. How much rotation in ω can you do before you find another peak? What is the η-range in which you can safely assign this peak and not its neighbor?

==== Working on background ====

To get rid of the background, we combine all the diffraction images and calculate an average image and a median image. The average or median image is then subtracted from every diffraction image, which should remove the background. With simulated data, the background typically will not strongly affect your peak search. This is nevertheless a good point at which to practice the [[processing:backgroundgeneral|median and average image]] calculations, because with real data this procedure is essential!

The average image is a representation of the data that includes
  * the background,
  * the diffraction from the //surrounding matrix// or the //powder portion of the sample// that gives you continuous diffraction rings in the image,
  * the diffraction from the //sample grains//, that give rise to well-defined diffraction spots.

The median image is a representation of the data that includes
  * the background,
  * the diffraction from the //surrounding matrix// or the //powder portion of the sample// that gives you continuous diffraction rings in the image.
The diffraction from the //sample grains//, which gives rise to well-defined diffraction spots, is removed and **//does not contribute//** to the median image!

In [[software:fabian|Fabian]], you can subtract the median image from the original data. You will see the spots from the sample grains only. In theory, all background and contribution from the surrounding sample matrix should be removed.
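
The difference between the average and the median image can be illustrated with a toy example. The sketch below is not the code used by Fabian or any other tool mentioned here. It uses only the Python standard library on three made-up 2×3 "images", and all names (''pixelwise'', ''subtract'', ''frames'') are hypothetical. Each frame has a constant background of 10 and a "ring" column at 30, and the middle frame has one grain spot of 500.

```python
from statistics import mean, median


def pixelwise(stack, reducer):
    """Apply reducer (mean or median) to each pixel across a stack of images."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[reducer([img[r][c] for img in stack]) for c in range(cols)]
            for r in range(rows)]


def subtract(image, background):
    """Subtract a background image from an image, pixel by pixel."""
    return [[p - b for p, b in zip(row_i, row_b)]
            for row_i, row_b in zip(image, background)]


# Made-up data: background 10, powder ring 30 in every frame,
# one sharp grain spot (500) present in the middle frame only.
frames = [
    [[10, 30, 10], [10, 30, 10]],
    [[10, 30, 500], [10, 30, 10]],
    [[10, 30, 10], [10, 30, 10]],
]

median_img = pixelwise(frames, median)   # the spot does not contribute
average_img = pixelwise(frames, mean)    # the spot leaks into the average
cleaned = subtract(frames[1], median_img)

print(median_img)
print(average_img)
print(cleaned)
```

In this toy case the median image keeps the background and the ring but not the spot, so subtracting it leaves only the spot. For real detector images you would typically work with arrays rather than nested lists.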

==== Peak extraction ====

At this point, you are ready for [[processing:search_for_peaks|peak extraction]]. You should test the effect of different thresholds.

Typically, at this step, you will provide the name of the median image. The median image will be subtracted from the data, so the thresholds you define are peak intensities measured above the background and the contribution of the surrounding sample matrix.
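
To get a feeling for why the threshold matters, here is a toy illustration in Python. It is **not** the algorithm used by PeakSearch. It simply counts groups of connected pixels above a threshold in a small made-up, background-subtracted image, and the function name ''count_peaks'' is hypothetical.

```python
def count_peaks(image, threshold):
    """Count 4-connected groups of pixels strictly above threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    n_peaks = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # New peak found: flood-fill all connected pixels above threshold
            n_peaks += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return n_peaks


# Made-up background-subtracted image: a medium peak, a weak peak, a strong peak
image = [
    [0, 1, 0, 0, 0, 0],
    [1, 50, 40, 0, 2, 0],
    [0, 45, 0, 0, 8, 9],
    [0, 0, 0, 0, 0, 0],
    [3, 0, 0, 120, 0, 0],
]

counts = {t: count_peaks(image, t) for t in (5, 20, 100)}
print(counts)
```

A low threshold picks up weak peaks (and, on real data, noise), while a high threshold keeps only the strongest ones. This is the trade-off to explore when testing thresholds.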

Evaluate the outcome of the peak search by loading the peaks which were found into [[software:fabian|Fabian]] and see if they match the actual peak positions. 

==== Evaluate g-vectors ====

The next step in the process is the [[processing:compute_gvectors|calculation of g-vectors]]. For a given reflection in the crystal, **G**<sub>hkl</sub> is perpendicular to the diffracting plane (hkl) and its norm is 2π/d<sub>hkl</sub>. The components of **G**<sub>hkl</sub> can be calculated from the experimental measurements, namely the x-ray wavelength λ, the diffraction angle 2θ, the azimuth angle η, and the rotation angle ω.

In order to do so, you need to precisely evaluate your experimental geometry (beam center, detector distance, detector tilt, etc). Once this is done, g-vectors can be calculated directly from the locations of the peaks extracted from the diffraction images.
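
As a rough illustration of how λ, 2θ, η and ω combine into a g-vector, here is a minimal Python sketch. It is not the ImageD11 implementation. The axis and sign conventions are **assumptions** made for this example (beam along +x, ω rotation about the z axis, η measured from the z axis), and conventions differ between programs, so check the documentation of the software you use. The function name ''g_vector'' is hypothetical. The norm, 4π sin θ / λ = 2π/d, does not depend on these conventions.

```python
import math


def g_vector(wavelength, two_theta_deg, eta_deg, omega_deg):
    """Scattering vector in the sample frame (conventions are assumptions)."""
    tth = math.radians(two_theta_deg)
    eta = math.radians(eta_deg)
    om = math.radians(omega_deg)
    k = 2.0 * math.pi / wavelength  # 2*pi convention, as in the text

    # g in the laboratory frame: k_out - k_in, with the beam along +x
    gx = k * (math.cos(tth) - 1.0)
    gy = -k * math.sin(tth) * math.sin(eta)
    gz = k * math.sin(tth) * math.cos(eta)

    # Undo the sample rotation by omega about z (assumed rotation sense)
    c, s = math.cos(om), math.sin(om)
    return (c * gx + s * gy, -s * gx + c * gy, gz)


# Arbitrary illustrative values: wavelength 0.5 (same length unit as d)
g = g_vector(0.5, 10.0, 30.0, 15.0)
norm = math.sqrt(sum(comp * comp for comp in g))
expected = 4.0 * math.pi * math.sin(math.radians(5.0)) / 0.5  # 2*pi/d from Bragg's law

print(g, norm, expected)
```

The check at the end compares the norm of the computed vector with 2π/d obtained from Bragg's law, which is a useful sanity test whatever conventions are in use.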

Follow the procedure described in the [[processing:compute_gvectors|compute G-vectors]] section. At the end of the procedure, you will save the list of experimentally detected g-vectors in a text [[fileformat:gve|gve]] file.

==== Grain indexing ====

To index the grains you need [[software:grainspotter|GrainSpotter]] and an //.ini// file. If you previously did a simulation with [[software:polyxsim|PolyXSim]], you already have an //.ini// file which you can modify for your purposes. Make sure to keep the original and modify only a copy. For details on what this //.ini// file should contain, check out the [[software:grainspotter|GrainSpotter wiki page]], the [[fileformat:ini|.ini wiki page]] or the [[GrainSpotter manual]]. Make sure the //.ini// file points to the right //.gve// file (the one you just created).

The outcome of the GrainSpotter algorithm is three files: a //.gff// file, a //.ubi// file and a //.log// file. These files contain information on the number of grains found, their UBi matrices and some more info. If you are already working with real data, you can now interpret what you got.

==== Check your workflow ====

If you did a simulation in advance, this is the time to check if you (and the software) did a good job or not. Open the //.gff// file which was just created by GrainSpotter and compare the indexed grains with the ones which were created by the simulation at the very beginning. The UBi matrices can be in a different order but should be the same. Remember that some rows within a matrix can be exchanged or inverted due to symmetry.

<WRAP center round box 60%>
**Example**: The following two matrices were created by PolyXSim (left) and by GrainSpotter (right). The symmetry of the material is tetragonal. This means that the //a-axis// and //b-axis// are equivalent and cannot be distinguished by the software, so row 1 and row 2 are exchangeable. In addition, their signs are opposite, but the software cannot distinguish the polarity of the grain either. On this basis, the two UBi matrices are identical.

  From PolyXSim             From GrainSpotter
   3.582  2.186 -0.098       4.411  0.360  1.912
  -4.411 -0.360 -1.912      -3.582 -2.186  0.098
   0.133  1.995  0.247       0.133  1.995  0.247
</WRAP>
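
The comparison in the example above can be automated for a pair of matrices. The Python sketch below is a simplification that follows the reasoning of the example only: it allows rows 1 and 2 to be swapped and the sign of each row to be flipped. It is not a full treatment of crystal symmetry. The function name ''ubi_equivalent'' and the tolerance of 0.01 are assumptions chosen for illustration.

```python
from itertools import product


def ubi_equivalent(ubi_a, ubi_b, tol=0.01):
    """True if ubi_b equals ubi_a up to a swap of rows 1-2 and row sign flips."""
    for order in ((0, 1, 2), (1, 0, 2)):          # keep or swap rows 1 and 2
        for signs in product((1, -1), repeat=3):  # flip the sign of any row
            candidate = [[signs[i] * v for v in ubi_a[order[i]]]
                         for i in range(3)]
            if all(abs(candidate[i][j] - ubi_b[i][j]) <= tol
                   for i in range(3) for j in range(3)):
                return True
    return False


# Matrices from the example above
polyxsim = [
    [3.582, 2.186, -0.098],
    [-4.411, -0.360, -1.912],
    [0.133, 1.995, 0.247],
]
grainspotter = [
    [4.411, 0.360, 1.912],
    [-3.582, -2.186, 0.098],
    [0.133, 1.995, 0.247],
]

match = ubi_equivalent(polyxsim, grainspotter)
mismatch = ubi_equivalent(polyxsim, [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
print(match, mismatch)
```

Looping such a test over all simulated and indexed grains gives the fraction of grains that were recovered, but the set of allowed row operations has to be adapted to the symmetry of your material.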

If all the simulated UBi matrices match the calculated ones, you can be quite sure that your workflow is running properly. As a next step, you can work with real data.