pgfplots : How to add a linear regression line?

Post by Cham »

I'm studying the pgfplots package to make some cool graphics using experimental data. I usually use Excel for this kind of stuff, but I'm interested in doing the same directly with LaTeX.

The trouble with most LaTeX documentation is that it's extremely heavy, and its examples are too often "out of this world". I'm having difficulty finding the proper setup to add a linear regression line and to customize the markers and the tick numbers. Here's an MWE to work with:

Code:

\documentclass[11pt,letterpaper,twoside]{article}
\usepackage[total={6.5in,10in},left=1in,top=0.5in,includehead,includefoot]{geometry}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}

\begin{document}

Some cute data graphic :

\begin{center}
	\begin{tikzpicture}
	\begin{axis}[
		height=9cm,
		width=9cm,
		grid=both,
		tick align=inside,
		minor tick num=3,
		major tick style={black,thin},
		minor tick style={black},
		major grid style={color=gray!60,densely dashed},
		minor grid style={color=gray!50,densely dotted},
		tick label style={font=\footnotesize},
		label style={font=\normalsize},
		xlabel=$X$,
		ylabel=$Y$,
		title={Some title},
		xtick={-1, 0, 1, 2, 3, 4, 5},
		ytick={-1, 0, 1, 2, 3, 4, 5, 6}
	]
	\addplot[blue,mark=*,mark size=1.10,only marks,error bars/.cd,x dir=both,y dir=both,x explicit,y explicit]
	coordinates {
		(0, 0) +- (0.1, 0.4)
		(0.5, 1) +- (0.4, 0.2)
		(1, 2) +- (0.2, 0.4)
		(1.55, 3.25) +- (0.15, 0.3)
		(2, 5) +- (0.3, 0.3)};
	\end{axis}
	\end{tikzpicture}
\end{center}

\end{document}

Here's a preview, with a few things shown in red that I want to add to the plot:

[Image: graphic.jpg]

1. How can I modify the code above to add a linear regression line, with options for its style (color and thickness)?

2. How can I show its equation (with many digits) and change its "X" and "Y" symbols, so that the equation uses exactly the same symbols as the axes?

3. How can I add more digits to the tick label numbers on the two axes? Writing 0.0, 1.0, ... in the code does nothing.

4. How can I change the marker color and the error bar color separately? For example, the markers in blue and all the error bars in red?

Post by Cham »

The answer to question 4 above is simply to add the following option to the \addplot options, after error bars/.cd:

Code:

error bar style={red,thin} % ultra thin, very thin, thick, ...

EDIT: The answer to question 3 appears to be this, added to the axis options (with yticklabel style for the other axis):

Code:

xticklabel style={/pgf/number format/precision=1,/pgf/number format/fixed zerofill},

That one was not obvious!
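
For reference, here is a minimal sketch of where the two options sit in the MWE from the first post. The yticklabel style line is an assumption added for symmetry; only the xticklabel version is given above.

```latex
\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}

\begin{document}
\begin{tikzpicture}
\begin{axis}[
    xlabel=$X$,
    ylabel=$Y$,
    % question 3: one decimal with trailing zeros on the x axis
    xticklabel style={/pgf/number format/precision=1,/pgf/number format/fixed zerofill},
    % assumption: the same style applied to the y axis
    yticklabel style={/pgf/number format/precision=1,/pgf/number format/fixed zerofill}
]
\addplot[
    blue, mark=*, mark size=1.10, only marks,
    error bars/.cd,
    x dir=both, y dir=both, x explicit, y explicit,
    error bar style={red,thin} % question 4: bars in red, markers stay blue
]
coordinates {
    (0, 0) +- (0.1, 0.4)
    (0.5, 1) +- (0.4, 0.2)
    (1, 2) +- (0.2, 0.4)
    (1.55, 3.25) +- (0.15, 0.3)
    (2, 5) +- (0.3, 0.3)
};
\end{axis}
\end{tikzpicture}
\end{document}
```

The error bar style key lives under error bars/, which is why it goes after error bars/.cd and not with the plot's own color options.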

Post by mas »

1. How to modify the code above to add a linear regression line, with options for its style (color and thickness)?

Just add another \addplot command:

Code:

\addplot+[no marks,red, thick] {1.2304 * x + 0.5121 } ;

2. How to show its equation (with many digits)? and to change its "X" and "Y" symbols? (so the equation is using exactly the same symbols as on the axis).

The usual TikZ \node command should do the trick:

Code:

\node at (1.5,0) {$Y = 1.2304 X + 0.5121$};
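
Put together, the two commands could be used as in the sketch below. The domain, the axis limits and the node position are assumptions chosen for this illustration. The coefficients are the ones typed by hand above, not values computed from the data.

```latex
\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}

\begin{document}
\begin{tikzpicture}
\begin{axis}[xlabel=$X$, ylabel=$Y$, xmin=-0.5, xmax=2.5, ymin=-1, ymax=6]
    % data points from the question (error bars omitted for brevity)
    \addplot[blue, mark=*, only marks] coordinates {
        (0, 0) (0.5, 1) (1, 2) (1.55, 3.25) (2, 5)
    };
    % hand-typed line; domain restricts it to the visible x range
    \addplot[red, thick, no marks, domain=-0.5:2.5] {1.2304*x + 0.5121};
    % label in axis coordinates, using the same symbols as the axis labels
    \node[anchor=west, red] at (axis cs:-0.4,5.5) {$Y = 1.2304\,X + 0.5121$};
\end{axis}
\end{tikzpicture}
\end{document}
```

Without an explicit domain, an expression plot is sampled over the default range of -5 to 5, which may stretch the axis when no limits are set.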

OS: Debian/GNU Linux; LaTeX System : TeXLive; Editor : Vim

Post by Cham »

This is working great. Thanks a lot for the trick.

I've found another way, more complicated, but with the advantage of calculating the proper trend line from the data points. The legend can also show the linear equation with any number of digits. I may post the code later, but it's much more complicated than your solution.

Post by mas »

Good to hear that your problem is solved.

pgfplots can do regression and draw the appropriate trend line. Since that was not the question, I did not suggest it :D
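
As a rough sketch of that route: the pgfplotstable package provides the create col/linear regression key, which fits a line to a table column and stores the slope and intercept in \pgfplotstableregressiona and \pgfplotstableregressionb. The macro name \datatable and the legend wording are assumptions made for this example; the data is from the first post.

```latex
\documentclass{article}
\usepackage{pgfplots,pgfplotstable}
\pgfplotsset{compat=newest}

% Data from the first post, loaded once; \datatable is a name chosen for this sketch.
\pgfplotstableread{
X    Y
0    0
0.5  1
1    2
1.55 3.25
2    5
}\datatable

\begin{document}
\begin{tikzpicture}
\begin{axis}[xlabel=$X$, ylabel=$Y$, legend pos=south east]
    % measured points; "forget plot" keeps them out of the legend
    \addplot[blue, mark=*, only marks, forget plot] table[x=X, y=Y] {\datatable};
    % least-squares line computed by pgfplotstable
    \addplot[red, thick, no marks]
        table[x=X, y={create col/linear regression={y=Y}}] {\datatable};
    % slope and intercept printed with the axis symbols X and Y
    \addlegendentry{$Y = \pgfmathprintnumber[precision=4]{\pgfplotstableregressiona}\,X
        \pgfmathprintnumber[precision=4, print sign]{\pgfplotstableregressionb}$}
\end{axis}
\end{tikzpicture}
\end{document}
```

With a single fit per axis, the two global macros can be used directly in the legend entry.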

Good idea to post your solution when you find time.

Post by Cham »

I'm almost getting it right! Here's the complete code (a bit complex; how can I simplify it?):

Code:

\documentclass[11pt,letterpaper,twoside]{article}
\usepackage[total={6.5in,10in},left=1in,top=0.5in,includehead,includefoot]{geometry}
\usepackage{pgfplots,pgfplotstable}
\pgfplotsset{compat=newest}

\begin{document}

\begin{center}
	\begin{tikzpicture}
	\begin{axis}[
		height=12cm,
		width=15cm,
		grid=both,
		tick align=inside,
		minor tick num=4,
		major tick style={black,thin},
		minor tick style={black},
		major grid style={color=gray!60,densely dashed},
		minor grid style={color=gray!50,densely dotted},
		tick label style={font=\footnotesize},
		label style={font=\normalsize},
		xlabel=$X$,
		ylabel=$Y$,
		xmin=-0.5,
		xmax=2.5,
		ymin=-1,
		ymax=6,
		title={Put an hilarious title here},
		xtick={-1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5},
		ytick={-1, 0, 1, 2, 3, 4, 5, 6},
		xticklabel style={/pgf/number format/precision=2,/pgf/number format/fixed zerofill},
		yticklabel style={/pgf/number format/precision=0,/pgf/number format/fixed zerofill},
		legend pos=south east,
		legend style={empty legend}
	]
	\addplot[
		blue,
		mark=*,
		mark size=1.20,
		only marks,
		error bars/.cd,
		x dir=both,
		y dir=both,
		x explicit,
		y explicit,
		error bar style={black,semithick}
	]
	coordinates{
		(0, 0) +- (0.1, 0.3)
		(0.5, 1) +- (0.2, 0.15)
		(1.1, 2.3) +- (0.2, 0.4)
		(1.55, 3.25) +- (0.06, 0.5)
		(2, 4.7) +- (0.3, 0.3)};
	% Average slope line :
	\addplot[black,thick,empty legend] table[y={create col/linear regression={y=Y}}]
	{
		X Y
		0 0
		0.5 1
		1.1 2.3
		1.55 3.25
		2 4.7
	};
	\addlegendentry[/pgf/number format/precision=6]{
		$\ell_{ave} = \pgfmathprintnumber[]{\pgfplotstableregressiona} \, m
		\pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$
	};
	% Maximal slope line :
	\addplot[
		black,
		thick,
		mark=*,
		mark size=1.3,
		mark options={fill=white},
		only marks
	]
	coordinates{
		(0.1, -0.41763)
		(1.8, 5.009598)
	};
	\addplot[thick,red,empty legend] table[y={create col/linear regression={y=Y}}]
	{
		X Y
		0.1 -0.41763
		1.8 5.009598
	};
	\addlegendentry[/pgf/number format/precision=6]{
		$\ell_{max} = \pgfmathprintnumber[]{\pgfplotstableregressiona} \, m
		\pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$
	};
	% Minimal slope line :
	\addplot[
		black,
		thick,
		mark=*,
		mark size=1.3,
		mark options={fill=white},
		only marks
	]
	coordinates{
		(-0.1, 0.18237)
		(2.4, 4.409598)
	};
	\addplot[thick,olive,empty legend] table[y={create col/linear regression={y=Y}}]
	{
		X Y
		-0.1 0.18237
		2.4 4.409598
	};
	\addlegendentry[/pgf/number format/precision=6]{
		$\ell_{min} = \pgfmathprintnumber[]{\pgfplotstableregressiona} \, m
		\pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$
	};
	\end{axis}
	\end{tikzpicture}
\end{center}

\end{document}

Here's a hilarious preview ( :| ), with an unsolved problem indicated in red:

[Image: graph.jpg]

The problem is the numbers in the three linear trend equations in the legend. How can I tell LaTeX to show the numbers calculated for each trend line? Currently, all three entries repeat the numbers of the last line only (the minimal slope line).

Post by Cham »

The pgfplots manual says on page 394 that the numbers \pgfplotstableregressiona and \pgfplotstableregressionb are stored globally. Each new regression overwrites them, and the legend is only typeset at the end of the axis, which explains why my equations show the same numbers for all three lines. How can I show the numbers properly? The manual isn't clear about this.

Post by Cham »

Aha! I think I get it. From page 395 of the manual, I need to save the values right after each \addplot with the commands

Code:

\xdef\slopeA{\pgfplotstableregressiona}
\xdef\bA{\pgfplotstableregressionb}

and then use the new commands \slopeA and \bA instead of \pgfplotstableregressiona and \pgfplotstableregressionb in my equation (in the legend), with a different pair of names for each trend line. It appears to work nicely.
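
A shortened sketch of how this fits into the code above, with two of the three lines. The names \slopeB and \bB are hypothetical, chosen here by analogy with \slopeA and \bA. The data markers and error bars are left out.

```latex
\documentclass{article}
\usepackage{pgfplots,pgfplotstable}
\pgfplotsset{compat=newest}

\begin{document}
\begin{tikzpicture}
\begin{axis}[xlabel=$X$, ylabel=$Y$, legend pos=south east]
    % Average slope line: fit, then freeze the coefficients immediately
    \addplot[black,thick] table[y={create col/linear regression={y=Y}}]
    {
        X Y
        0 0
        0.5 1
        1.1 2.3
        1.55 3.25
        2 4.7
    };
    \xdef\slopeA{\pgfplotstableregressiona}
    \xdef\bA{\pgfplotstableregressionb}
    \addlegendentry[/pgf/number format/precision=6]{
        $\ell_{ave} = \pgfmathprintnumber{\slopeA} \, m
        \pgfmathprintnumber[print sign]{\bA}$
    }
    % Maximal slope line: this fit overwrites the global values, so save again
    \addplot[thick,red] table[y={create col/linear regression={y=Y}}]
    {
        X Y
        0.1 -0.41763
        1.8 5.009598
    };
    \xdef\slopeB{\pgfplotstableregressiona} % hypothetical name
    \xdef\bB{\pgfplotstableregressionb}     % hypothetical name
    \addlegendentry[/pgf/number format/precision=6]{
        $\ell_{max} = \pgfmathprintnumber{\slopeB} \, m
        \pgfmathprintnumber[print sign]{\bB}$
    }
\end{axis}
\end{tikzpicture}
\end{document}
```

\xdef expands the regression macro at that point and stores the result globally, so each legend entry keeps its own numbers when the legend is finally drawn.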

Post by Stefan Kottwitz »

Interesting!

Just a short remark: I noticed that you wrote the indices ave, max and min in italics. It's common practice to write variables in italics but operators (and such names) in upright shape; that's why there are commands \max and \min. Units are also commonly written upright so they are not mistaken for variables (m for meter rather than the variable m).
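
A small sketch of the difference, assuming plain \mathrm for the upright text; the value 3 is only a placeholder.

```latex
\documentclass{article}
\begin{document}
% Italic subscript: reads like the product of the variables a, v and e
$\ell_{ave}$

% Upright subscripts; \max and \min are predefined operator names
$\ell_{\mathrm{ave}}$, $\ell_{\max}$, $\ell_{\min}$

% Upright unit versus italic variable
$3\,\mathrm{m}$ versus $3\,m$
\end{document}
```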

Stefan
LaTeX.org admin

Post by Cham »

Yes, I agree, Stefan. I used italic ave, min and max in the example above just to simplify things. The code is already heavy enough.