pgfplots : How to add a linear regression line? (Topic is solved)

Cham
Posts: 885
Joined: Sat Apr 02, 2011 4:06 pm

pgfplots : How to add a linear regression line?

Postby Cham » Thu Dec 14, 2017 5:40 pm

I'm studying the pgfplots package to make some cool graphics using experimental data. I usually use Excel for this kind of stuff, but I'm interested in doing the same directly with LaTeX.

The trouble with most LaTeX documentation is that it's extremely heavy, and its examples are too often "out of this world". I'm having difficulty finding the proper setup to add a linear regression line and to customize the markers and the tick numbers. Here's an MWE to work with:
    \documentclass[11pt,letterpaper,twoside]{article}
    \usepackage[total={6.5in,10in},left=1in,top=0.5in,includehead,includefoot]{geometry}
    \usepackage{pgfplots}
    \pgfplotsset{compat=newest}

    \begin{document}

    Some cute data graphic :

    \begin{center}
    \begin{tikzpicture}
    \begin{axis}[
    height=9cm,
    width=9cm,
    grid=both,
    tick align=inside,
    minor tick num=3,
    major tick style={black,thin},
    minor tick style={black},
    major grid style={color=gray!60,densely dashed},
    minor grid style={color=gray!50,densely dotted},
    tick label style={font=\footnotesize},
    label style={font=\normalsize},
    xlabel=$X$,
    ylabel=$Y$,
    title={Some title},
    xtick={-1, 0, 1, 2, 3, 4, 5},
    ytick={-1, 0, 1, 2, 3, 4, 5, 6}
    ]
    \addplot[blue,mark=*,mark size=1.10,only marks,error bars/.cd,x dir=both,y dir=both,x explicit,y explicit]
    coordinates {
    (0, 0) +- (0.1, 0.4)
    (0.5, 1) +- (0.4, 0.2)
    (1, 2) +- (0.2, 0.4)
    (1.55, 3.25) +- (0.15, 0.3)
    (2, 5) +- (0.3, 0.3)};
    \end{axis}
    \end{tikzpicture}
    \end{center}

    \end{document}

Here's a preview, with a few things shown in red that I want to add in the plot :
[Attachment: graphic.jpg]


1. How to modify the code above to add a linear regression line, with options for its style (color and thickness)?

2. How to show its equation (with many digits)? and to change its "X" and "Y" symbols? (so the equation is using exactly the same symbols as on the axis).

3. How to add more digits to the tick label numbers, on the two axes? Writing 0.0, 1.0, ... in the code does nothing.

4. How to change the marker's color, and the error bars color separately? For example the markers in blue, and all the error bars in red?

Postby Cham » Thu Dec 14, 2017 6:36 pm

The answer to question 4 above is simply to add the following option after error bars/.cd in the \addplot options:
    error bar style={red,thin} % ultra thin, very thin, thick, ...


EDIT : The answer to question 3 appears to be this, added to the axis options :
    xticklabel style={/pgf/number format/precision=1,/pgf/number format/fixed zerofill},

That one was not obvious!
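
Putting the two answers together, a minimal sketch might look like the following. It reuses the data from the MWE above; the choice of compat=1.15 and the extra /pgf/number format/fixed key are assumptions made here for a self-contained example, not something stated in the thread.

```latex
\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.15} % assumption: any reasonably recent compat level should do

\begin{document}
\begin{tikzpicture}
\begin{axis}[
  xlabel=$X$,
  ylabel=$Y$,
  % Question 3: force one decimal on the x tick labels (0.0, 0.5, 1.0, ...)
  xticklabel style={
    /pgf/number format/fixed,
    /pgf/number format/fixed zerofill,
    /pgf/number format/precision=1
  },
  % Same idea for the y axis
  yticklabel style={
    /pgf/number format/fixed,
    /pgf/number format/fixed zerofill,
    /pgf/number format/precision=1
  }
]
% Question 4: markers in blue, error bars in red
\addplot[
  blue, mark=*, only marks,
  error bars/.cd,
  x dir=both, y dir=both,
  x explicit, y explicit,
  error bar style={red, thin}
]
coordinates {
  (0, 0)       +- (0.1, 0.4)
  (0.5, 1)     +- (0.4, 0.2)
  (1, 2)       +- (0.2, 0.4)
  (1.55, 3.25) +- (0.15, 0.3)
  (2, 5)       +- (0.3, 0.3)
};
\end{axis}
\end{tikzpicture}
\end{document}
```

The error bar style key sits after error bars/.cd, so it only affects the bars; the plot color given before it still applies to the markers.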

mas
Posts: 206
Joined: Thu Dec 04, 2008 4:39 am

Postby mas » Fri Dec 15, 2017 2:45 am

1. How to modify the code above to add a linear regression line, with options for its style (color and thickness)?


Just add another addplot command:
    \addplot+[no marks, red, thick] {1.2304 * x + 0.5121};


2. How to show its equation (with many digits)? and to change its "X" and "Y" symbols? (so the equation is using exactly the same symbols as on the axis).


The usual tikz node command should do the trick.
    \node at (1.5,0) {$Y = 1.2304 X + 0.5121$};
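
With this approach the slope and intercept have to be computed elsewhere and typed in by hand; the values 1.2304 and 0.5121 do not appear to be a fit to the MWE data. As a rough worked example, assuming an ordinary unweighted least-squares fit (which ignores the error bars) of Y = aX + b to the five points of the first MWE:

```latex
% Ordinary least-squares slope a and intercept b for n points (x_i, y_i)
\[
  a = \frac{n\sum x_i y_i - \sum x_i \sum y_i}
           {n\sum x_i^2 - \left(\sum x_i\right)^2},
  \qquad
  b = \frac{\sum y_i - a \sum x_i}{n}.
\]
% Sums for (0,0), (0.5,1), (1,2), (1.55,3.25), (2,5):
%   n = 5, sum x = 5.05, sum y = 11.25, sum xy = 17.5375, sum x^2 = 7.6525
\[
  a = \frac{87.6875 - 56.8125}{38.2625 - 25.5025}
    = \frac{30.875}{12.76} \approx 2.4197,
  \qquad
  b = \frac{11.25 - 2.4197 \times 5.05}{5} \approx -0.1939.
\]
```

These numbers could then be substituted into the \addplot expression and the \node text.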

OS: Debian/GNU Linux; LaTeX System : TeXLive; Editor : Vim

Postby Cham » Fri Dec 15, 2017 4:08 am

This is working great. Thanks a lot for the trick.

I've found another way, more complicated, but which has the advantage of calculating the proper trend line from the data points. The legend can also give the proper linear equation with any number of digits. I may give the code later, but it's much more complicated than your solution.

Postby mas » Fri Dec 15, 2017 5:15 am

Good to hear that your problem is solved.

pgfplots can do regression and draw the appropriate trend line. Since that was not the question, I did not suggest it :D

Good idea to post your solution when you find time.
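
For reference, a minimal sketch of the built-in regression, modeled on the linear regression example in the pgfplots manual. The table macro \datatable and the compat level are assumptions chosen for this illustration; the data are the points from the first MWE, without error bars.

```latex
\documentclass{article}
\usepackage{pgfplots}
\usepackage{pgfplotstable} % provides "create col/linear regression"
\pgfplotsset{compat=1.15}  % assumption: any reasonably recent compat level

% \datatable is a hypothetical name for the inline data table
\pgfplotstableread{
X    Y
0    0
0.5  1
1    2
1.55 3.25
2    5
}\datatable

\begin{document}
\begin{tikzpicture}
\begin{axis}[xlabel=$X$, ylabel=$Y$, legend pos=north west]
  % The measured points
  \addplot[blue, only marks, mark=*] table {\datatable};
  \addlegendentry{data}

  % The fitted line: the y column is replaced by the regression values
  \addplot[red, thick, no markers]
    table[y={create col/linear regression={y=Y}}] {\datatable};
  % Slope and intercept are available in these two macros after the fit
  \addlegendentry{$Y = \pgfmathprintnumber{\pgfplotstableregressiona}\,X
    \pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$}
\end{axis}
\end{tikzpicture}
\end{document}
```

The fit is an unweighted one unless the regression is given a variance column, so the error bars do not influence the line in this sketch.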

Postby Cham » Fri Dec 15, 2017 4:28 pm

I'm almost getting it right! Here's the complete code (a bit complex; how can I simplify it?):
    \documentclass[11pt,letterpaper,twoside]{article}
    \usepackage[total={6.5in,10in},left=1in,top=0.5in,includehead,includefoot]{geometry}
    \usepackage{pgfplots,pgfplotstable}
    \pgfplotsset{compat=newest}

    \begin{document}

    \begin{center}
    \begin{tikzpicture}
    \begin{axis}[
    height=12cm,
    width=15cm,
    grid=both,
    tick align=inside,
    minor tick num=4,
    major tick style={black,thin},
    minor tick style={black},
    major grid style={color=gray!60,densely dashed},
    minor grid style={color=gray!50,densely dotted},
    tick label style={font=\footnotesize},
    label style={font=\normalsize},
    xlabel=$X$,
    ylabel=$Y$,
    xmin=-0.5,
    xmax=2.5,
    ymin=-1,
    ymax=6,
    title={Put an hilarious title here},
    xtick={-1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5},
    ytick={-1, 0, 1, 2, 3, 4, 5, 6},
    xticklabel style={/pgf/number format/precision=2,/pgf/number format/fixed zerofill},
    yticklabel style={/pgf/number format/precision=0,/pgf/number format/fixed zerofill},
    legend pos=south east,
    legend style={empty legend}
    ]
    \addplot[
    blue,
    mark=*,
    mark size=1.20,
    only marks,
    error bars/.cd,
    x dir=both,
    y dir=both,
    x explicit,
    y explicit,
    error bar style={black,semithick}
    ]
    coordinates{
    (0, 0) +- (0.1, 0.3)
    (0.5, 1) +- (0.2, 0.15)
    (1.1, 2.3) +- (0.2, 0.4)
    (1.55, 3.25) +- (0.06, 0.5)
    (2, 4.7) +- (0.3, 0.3)};
    % Average slope line :
    \addplot[black,thick,empty legend] table[y={create col/linear regression={y=Y}}]
    {
    X Y
    0 0
    0.5 1
    1.1 2.3
    1.55 3.25
    2 4.7
    };
    \addlegendentry[/pgf/number format/precision=6]{
    $\ell_{ave} = \pgfmathprintnumber[]{\pgfplotstableregressiona} \, m
    \pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$
    };
    % Maximal slope line :
    \addplot[
    black,
    thick,
    mark=*,
    mark size=1.3,
    mark options={fill=white},
    only marks
    ]
    coordinates{
    (0.1, -0.41763)
    (1.8, 5.009598)
    };
    \addplot[thick,red,empty legend] table[y={create col/linear regression={y=Y}}]
    {
    X Y
    0.1 -0.41763
    1.8 5.009598
    };
    \addlegendentry[/pgf/number format/precision=6]{
    $\ell_{max} = \pgfmathprintnumber[]{\pgfplotstableregressiona} \, m
    \pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$
    };
    % Minimal slope line :
    \addplot[
    black,
    thick,
    mark=*,
    mark size=1.3,
    mark options={fill=white},
    only marks
    ]
    coordinates{
    (-0.1, 0.18237)
    (2.4, 4.409598)
    };
    \addplot[thick,olive,empty legend] table[y={create col/linear regression={y=Y}}]
    {
    X Y
    -0.1 0.18237
    2.4 4.409598
    };
    \addlegendentry[/pgf/number format/precision=6]{
    $\ell_{min} = \pgfmathprintnumber[]{\pgfplotstableregressiona} \, m
    \pgfmathprintnumber[print sign]{\pgfplotstableregressionb}$
    };
    \end{axis}
    \end{tikzpicture}
    \end{center}

    \end{document}

Here's an hilarious preview ( :| ), with an unsolved problem indicated in red :
[Attachment: graph.jpg]


The problem is the numbers in the three linear trend equations in the legend. How can I tell LaTeX to show the proper numbers calculated for each trend line? Currently, all three equations repeat the numbers of the third line (the minimal slope line).

Postby Cham » Fri Dec 15, 2017 5:12 pm

The pgfplots manual says on page 394 that the numbers \pgfplotstableregressiona and \pgfplotstableregressionb are stored globally. This explains why my equations are showing the same numbers for all three lines. How can I show the numbers properly? The manual isn't clear about this.

Postby Cham » Fri Dec 15, 2017 5:22 pm

Ahaa! I think I get it. From page 395 of the manual, I need to use the optional commands
    \xdef\slopeA{\pgfplotstableregressiona}
    \xdef\bA{\pgfplotstableregressionb}

then use the new commands \slopeA and \bA instead of \pgfplotstableregressiona and \pgfplotstableregressionb in my equation (in the legend). It appears to work nicely.
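
A minimal sketch of that fix with two of the trend lines, assuming each pair of \xdef lines is placed directly after the \addplot that performs the fit, so the values are copied before the next fit overwrites them. The names \slopeB and \bB, the table macros \avetable and \maxtable, the compat level and the precision of 4 digits are assumptions made for this illustration; the data come from the code posted earlier.

```latex
\documentclass{article}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat=1.15} % assumption: any reasonably recent compat level

% Hypothetical table names holding the data from the earlier post
\pgfplotstableread{
X    Y
0    0
0.5  1
1.1  2.3
1.55 3.25
2    4.7
}\avetable

\pgfplotstableread{
X    Y
0.1  -0.41763
1.8  5.009598
}\maxtable

\begin{document}
\begin{tikzpicture}
\begin{axis}[xlabel=$X$, ylabel=$Y$, legend pos=south east]
  % Data points; "forget plot" keeps them out of the legend
  \addplot[blue, only marks, mark=*, forget plot] table {\avetable};

  % First fit: save the global results immediately
  \addplot[black, thick, no markers]
    table[y={create col/linear regression={y=Y}}] {\avetable};
  \xdef\slopeA{\pgfplotstableregressiona}
  \xdef\bA{\pgfplotstableregressionb}
  \addlegendentry{$Y = \pgfmathprintnumber[precision=4]{\slopeA}\,X
    \pgfmathprintnumber[precision=4, print sign]{\bA}$}

  % Second fit: overwrites the global macros, so save them under new names
  \addplot[red, thick, no markers]
    table[y={create col/linear regression={y=Y}}] {\maxtable};
  \xdef\slopeB{\pgfplotstableregressiona}
  \xdef\bB{\pgfplotstableregressionb}
  \addlegendentry{$Y = \pgfmathprintnumber[precision=4]{\slopeB}\,X
    \pgfmathprintnumber[precision=4, print sign]{\bB}$}
\end{axis}
\end{tikzpicture}
\end{document}
```

The legend is typeset only at the end of the axis, which is why the unsaved global macros all show the values of the last fit.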

Stefan Kottwitz
Site Admin
Posts: 8953
Joined: Mon Mar 10, 2008 9:44 pm

Postby Stefan Kottwitz » Sat Dec 16, 2017 2:27 pm

Interesting!

Just a short remark: I noticed that you wrote the indices ave, max and min in italic. It's common practice to write variables in italic, but operators (and such names) in upright shape. That's why there are commands \max and \min. Also units are commonly written upright, so they are not mistaken for variables (m for meter instead of the variable m).
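
As a small illustration of this remark, the index names and a unit could be set upright as follows. The value 1.5 is a placeholder chosen here, not a number from the thread.

```latex
\documentclass{article}
\begin{document}
% Index names upright, the variable itself italic
$\ell_{\mathrm{ave}}$, $\ell_{\max}$, $\ell_{\min}$

% An upright unit (metre) next to an italic variable m
$\ell = 1.5\,\mathrm{m}$ \quad versus \quad $\ell = 1.5\,m$
\end{document}
```

For documents with many quantities, a units package such as siunitx is a common alternative to writing \mathrm by hand.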

Stefan
Site admin

Postby Cham » Sat Dec 16, 2017 3:28 pm

Yes, I agree Stefan. I used italic ave min and max in the example above just to simplify things. This code is already heavy enough.

