Skip to main content

image processing - Deleting noisy data from a plot (manually) and export the best remaining data


I have the following temperature time-series. As you can see it contains very very noisy points in some periods (Fig.A). I tried to approach the best data(the black one in the center) based on the idea mentioned here: How to remove outliers from data but I got weak time-series as demonstrated in the pink dots (Fig.c).


Therefore, what come finally to my mind is to manually delete the unwanted signals and then export the desired signal data and interpolate the missing data by tracking the deleted time steps. But I need help to do this, or if there is very powerful approach to eliminate the unwanted signals might be a solution (mathematical or image processing by Mathematcia). Thanks in advanced.enter image description here


data:https://drive.google.com/file/d/12I7HYE2cjEFUJF3zQIgeCx8O_Xqtviaa/view


P.S. I do not have any duplicate in my data i.e. each time step has only one value.



Answer



Introduction


It seems to me that this question should be answered using more "traditional" time series methods than the already provided interesting solutions (with graphs and image processing.)


The workflow shown below is something considered during the design of the QRMon package and it is very similar to the data cleaning done in "Cleaning away data points which are enveloped within a function".


The "traditional" time series procedure





  1. Summarize the data




  2. Do a (Quantile Regression) fit.




  3. Pick points close to the fitted curve.




    • Using an appropriate threshold.




  4. Plot the picked points.




  5. If satisfactory results stop, else use the picked points as new data and goto to 1.





Get the data


Actually the other answers did not discuss how the data is obtained. I downloaded the data from the provided link and had to pre-process it a bit.


data = Import["~/Downloads/MSE-q188361.txt", "Data"];

Tally[Length /@ data]
(* {{5, 1704}, {4, 34}} *)

data = Select[data, Length[#] == 5 &];
data = data[[All, 1 ;; 4]];
data = Select[data, VectorQ[#, NumberQ] &];

Dimensions[data]
(* {1703, 4} *)

Workflow code


The implementation below uses the package QRMon:


Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/MonadicQuantileRegression.m"]

and Fold. Only two interations are needed, but I did experiment with different regression fits (algorithms, function bases parameters, and algorithm options) and different point picking thresholds.


The data is quite skewed, so the built-in function Fit does not work that well. The Quantile Regression algorithm is somewhat slow, but the whole computation should finish within 15 seconds.


AbsoluteTiming[

cleanData =
Fold[
First[Values[
QRMonUnit[#1]⟹
QRMonEcho[Style[Row[{"Iteration parameters:\n{number of knots, quantile, pick threshold}=", #2}], Bold, Purple, FontSize -> 16]]⟹
QRMonEchoDataSummary⟹
QRMonQuantileRegression[#2[[1]], #2[[2]], Method -> {LinearProgramming, Method -> "InteriorPoint", Tolerance -> 10^(-3)}]⟹
QRMonSetRegressionFunctionsPlotOptions[{PlotStyle -> Red}]⟹
QRMonPlot[ImageSize -> Large, PlotLabel -> Style["Data and fit", Bold, 16]]⟹
QRMonPickPathPoints[#2[[3]]]⟹

QRMonEchoFunctionValue[ListPlot[#, ImageSize -> Large, PlotLabel -> Style["Picked points", Bold, 16], PlotTheme -> "Detailed"] & /@ # &]⟹
QRMonTakeValue
]] &,
Join @@ data, {{16, 0.3, 0.1}, {30, 0.5, 0.025} (*,{24, 0.5, 0.01}*)}];
]

enter image description here


enter image description here


The final result is given to the variable cleanData:


Short[cleanData]

(* {{2, 5.3698}, {4, 5.3698} <<4563>> {6809, 4.813}, {6811, 4.813}) *)

Comments

Popular posts from this blog

front end - keyboard shortcut to invoke Insert new matrix

I frequently need to type in some matrices, and the menu command Insert > Table/Matrix > New... allows matrices with lines drawn between columns and rows, which is very helpful. I would like to make a keyboard shortcut for it, but cannot find the relevant frontend token command (4209405) for it. Since the FullForm[] and InputForm[] of matrices with lines drawn between rows and columns is the same as those without lines, it's hard to do this via 3rd party system-wide text expanders (e.g. autohotkey or atext on mac). How does one assign a keyboard shortcut for the menu item Insert > Table/Matrix > New... , preferably using only mathematica? Thanks! Answer In the MenuSetup.tr (for linux located in the $InstallationDirectory/SystemFiles/FrontEnd/TextResources/X/ directory), I changed the line MenuItem["&New...", "CreateGridBoxDialog"] to read MenuItem["&New...", "CreateGridBoxDialog", MenuKey["m", Modifiers-...

How to thread a list

I have data in format data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}} Tableform: I want to thread it to : tdata = {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}} Tableform: And I would like to do better then pseudofunction[n_] := Transpose[{data2[[1]], data2[[n]]}]; SetAttributes[pseudofunction, Listable]; Range[2, 4] // pseudofunction Here is my benchmark data, where data3 is normal sample of real data. data3 = Drop[ExcelWorkBook[[Column1 ;; Column4]], None, 1]; data2 = {a #, b #, c #, d #} & /@ Range[1, 10^5]; data = RandomReal[{0, 1}, {10^6, 4}]; Here is my benchmark code kptnw[list_] := Transpose[{Table[First@#, {Length@# - 1}], Rest@#}, {3, 1, 2}] &@list kptnw2[list_] := Transpose[{ConstantArray[First@#, Length@# - 1], Rest@#}, {3, 1, 2}] &@list OleksandrR[list_] := Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}] paradox2[list_] := Partition[Riffle[list[[1]], #], 2] & /@ Drop[list, 1] RM[list_] := FoldList[Transpose[{First@li...

plotting - How to draw lines between specified dots on ListPlot?

I would like to create a plot where I have unconnected dots and some connected. So far, I have figured out how to draw the dots. My code is the following: ListPlot[{{1, 1}, {2, 2}, {3, 3}, {4, 4}, {1, 4}, {2, 5}, {3, 6}, {4, 7}, {1, 7}, {2, 8}, {3, 9}, {4, 10}, {1, 10}, {2, 11}, {3, 12}, {4,13}, {2.5, 7}}, Ticks -> {{1, 2, 3, 4}, None}, AxesStyle -> Thin, TicksStyle -> Directive[Black, Bold, 12], Mesh -> Full] I have thought using ListLinePlot command, but I don't know how to specify to the command to draw only selected lines between the dots. Do have any suggestions/hints on how to do that? Thank you. Answer One possibility would be to use Epilog with Line : ListPlot[ {{1, 1}, {2, 2}, {3, 3}, {4, 4}, {1, 4}, {2, 5}, {3, 6}, {4, 7}, {1, 7}, {2, 8}, {3, 9}, {4, 10}, {1, 10}, {2, 11}, {3, 12}, {4, 13}, {2.5, 7}}, Ticks -> {{1, 2, 3, 4}, None}, AxesStyle -> Thin, TicksStyle -> Directive[Black, Bold, 12], Mesh -> Full, Epilog -> { Line[ ...