C# - How do I find specific matches using regex and put them in a string array?


I have an HTML file I'm trying to extract data from. The regex I'm using is:

"<tr.+?>.+?<td class=\"table_row_col2\"><b>(.+?)&.+?</b>.+?<td class=\"table_row_col5\">(.+?)</td>.+?<td class=\"table_row_col6\">(.+?)</td>.+?</tr>" 

It works in Python but not in C#. Here's the sample data:

<tr class="table_row" style="background-color: #d3d3d3;">
    <td class="table_row_col1">271</td>
    <td class="table_row_col2"><b>16/09/2015&nbsp;05:28&nbsp;pm</b></font></small></sup></td>
    <td class="table_row_col3"><span style="color:#e30613">14.3</span></td>
    <td class="table_row_col4">-</td>
    <td class="table_row_col5">8</td>
    <td class="table_row_col6">-</td>
    <td class="table_row_col7">-</td>
    <td class="table_row_col8">before dinner</td>
    <td class="table_row_col9">-</td>
    <td class="table_row_col10">-</td>
    <td class="table_row_col11">-</td>
</tr>

<tr class="table_row" style="background-color: #ffffff;">
    <td class="table_row_col1">272</td>
    <td class="table_row_col2"><b>16/09/2015&nbsp;02:54&nbsp;pm</b></font></small></sup></td>
    <td class="table_row_col3"><span style="color:#e30613">17.6</span></td>
    <td class="table_row_col4">-</td>
    <td class="table_row_col5">20</td>
    <td class="table_row_col6">32</td>
    <td class="table_row_col7">-</td>
    <td class="table_row_col8">other</td>
    <td class="table_row_col9">-</td>
    <td class="table_row_col10">-</td>
    <td class="table_row_col11">-</td>
</tr>

<tr class="table_row" style="background-color: #d3d3d3;">
    <td class="table_row_col1">273</td>
    <td class="table_row_col2"><b>15/09/2015&nbsp;11:09&nbsp;pm</b></font></small></sup></td>
    <td class="table_row_col3">-</td>
    <td class="table_row_col4">-</td>
    <td class="table_row_col5">-</td>
    <td class="table_row_col6">34</td>
    <td class="table_row_col7">-</td>
    <td class="table_row_col8">before bed</td>
    <td class="table_row_col9">-</td>
    <td class="table_row_col10">-</td>
    <td class="table_row_col11">-</td>
</tr>

I'm trying to extract the date from table_row_col2 and the numbers from table_row_col5 and table_row_col6.

If you know the HTML never changes, you can do it by adding a class Split:

    // htmlString holds the full HTML text; each entry in rows is one <tr> body.
    List<string> rows = Split.Extract(htmlString, "class=\"table_row\"", "</tr>");
    foreach (string row in rows)
    {
        string col2 = Split.Extract(row, "class=\"table_row_col2\"><b>", "</b>")[0];
        string col5 = Split.Extract(row, "class=\"table_row_col5\">", "</td>")[0];
        string col6 = Split.Extract(row, "class=\"table_row_col6\">", "</td>")[0];

        Console.WriteLine(col2 + ", " + col5 + ", " + col6);
    }

The additional class Split:

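    using System;
    using System.Collections.Generic;

    public class Split
    {
        // Returns every substring of source found between splitStart and the next splitEnd.
        public static List<string> Extract(string source, string splitStart, string splitEnd)
        {
            try
            {
                var results = new List<string>();

                string[] start = new string[] { splitStart };
                string[] end = new string[] { splitEnd };
                string[] temp = source.Split(start, StringSplitOptions.None);

                // temp[0] is the text before the first start marker, so begin at 1.
                for (int i = 1; i < temp.Length; i++)
                {
                    // Keep only the part before the first end marker.
                    results.Add(temp[i].Split(end, StringSplitOptions.None)[0]);
                }

                return results;
            }
            catch (Exception e)
            {
                throw new Exception(e.Message);
            }
        }
    }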
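For the regex route the question asks about, here is a minimal sketch. It assumes the pattern fails in C# because the file contains line breaks: by default `.` does not match a newline in .NET, and `RegexOptions.Singleline` changes that. The question does not confirm this is the cause, so treat it as one thing to try. The class name `RowParser` and the method name `Extract` are made up for illustration; the pattern is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RowParser
{
    // Pattern from the question, written as verbatim strings (quotes are doubled).
    // Singleline lets '.' match '\n' so one match can span a multi-line <tr> block.
    private static readonly Regex RowPattern = new Regex(
        @"<tr.+?>.+?<td class=""table_row_col2""><b>(.+?)&.+?</b>" +
        @".+?<td class=""table_row_col5"">(.+?)</td>" +
        @".+?<td class=""table_row_col6"">(.+?)</td>.+?</tr>",
        RegexOptions.Singleline);

    // Returns one string array per matched row: { date, col5, col6 }.
    public static List<string[]> Extract(string html)
    {
        var rows = new List<string[]>();
        foreach (Match m in RowPattern.Matches(html))
        {
            rows.Add(new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value });
        }
        return rows;
    }
}
```

Each array can then be printed with `Console.WriteLine(string.Join(", ", row))`. Because the first group stops at the first `&`, it captures only the date part of table_row_col2, not the time.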
