Comparing two columns looking for similarity - C#

I've got a program that looks at which files have changed in each SVN commit and also highlights the area that has changed. I take the data retrieved from SVN and place it into a table in a SQL Server database. What I'd like to do is compare the paths to see which area has been affected. I already have a table whose paths show which area each one maps to.
Example:
SVN:
branches/Projects/Enhancements2015Q1/WMDB/WMDB.cs
This is the path that has been found by the code.
Compare table
branches/Projects/Enhancements2015Q1/GEM4/Utilities/Utilities.csproj = Utilities
branches/Projects/Enhancements2015Q1/WMDB/WMDB.cs = WMDB
trunk/src/GEM 4/GEM4/UI/Forms/AutoRenderOptionsForm.cs = UI
So, for the path found in SVN, I'd like to determine that the WMDB area has been changed.
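One way to do the comparison in C#, once the compare table has been read into a dictionary, is to match on the containing folder of each known path. This is a sketch under my own assumptions - the method name and the longest-prefix rule are not from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: given the SVN path of a changed file and the
// compare table (path -> area), return the area whose containing folder
// is the longest prefix of the changed path. Matching on folders means
// any file under e.g. .../WMDB/ resolves to "WMDB".
string FindArea(string svnPath, IDictionary<string, string> compareTable)
{
    var match = compareTable
        .Select(kv => new
        {
            // Reduce each known path to its containing folder.
            Folder = kv.Key.Substring(0, kv.Key.LastIndexOf('/') + 1),
            Area = kv.Value
        })
        .Where(x => svnPath.StartsWith(x.Folder, StringComparison.OrdinalIgnoreCase))
        .OrderByDescending(x => x.Folder.Length) // prefer the most specific folder
        .FirstOrDefault();
    return match?.Area;
}
```

With the example table above, the changed path resolves to WMDB because it lives in the same folder as the WMDB.cs entry.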

Related

Windows indexed files OLE DB search with like instead of contains

So I have the problem that I need to exchange this CONTAINS for a LIKE in my query, because when you search for a word containing a dash, such as "12-0430-1", CONTAINS also produces results that contain only 12, or 0430, or 1. This is intended behaviour and is discussed by Microsoft here. The solution is also in that article - exchange CONTAINS for LIKE - but sadly in a way that does not help me, because I always get an error with this edited query:
SELECT TOP 2000 System.FileName
FROM systemindex
WHERE DIRECTORY = 'C:\Path...'
AND (System.FullText Like ('%12-0430-1%'))
ORDER BY System.DateCreated DESC
The error says the column does not exist, but for the OLE DB search I only found this older site, which specifies which columns can be searched.
Before, I did not need it because I used CONTAINS:
Ole-DB query for 22-0130-1:
SELECT TOP 2000 System.FileName
FROM systemindex
WHERE DIRECTORY = 'C:\Path'
AND CONTAINS('"12-0430-1*"')
ORDER BY System.DateCreated DESC
So could somebody please link me to a page listing the columns that can currently be searched over Windows indexed files, or tell me which column exists and would work? I want to search through the indexed contents of the files.
Edit: these are the settings with which I create the OleDbConnection:
OleDbConnection connection =
new OleDbConnection("Provider=Search.CollatorDSO;Extended Properties='Application=Windows';");

Selecting a specific directory depending on its name as compared to the ID of an article item

I'm working on a personal side project, a web store, as an ASP.NET Core application in C#.
I'm trying to create an ImageHandler service that is to be used for fetching images that are located in a folder in the project.
I'm trying to structure the images so that they are easy to manage in the project with code and easy to get an overview of, so I created a bunch of subfolders/directories.
The path looks like this:
WebRoot/Images/Articles/ArticleFolder/ArticleImages.jpg
The Article folder is named Article.id_Article.Name so a complete example would be:
\wwwroot\Images\Articles\1_iPhone 13 Pro Max – 5G smartphone 128GB Silver\1_01.jpg.
I'm trying to write a statement that finds the right folder for a given article by making a selection on the first part of the folder name that is equal to the ID of the article it belongs to.
My first attempt looked like this:
var path = Path.Combine(Enviroment.WebRootPath, "Images", "Articles");
string dir = Directory.GetDirectories(path).First(d => Path.GetFileName(d).StartsWith(articleId.ToString()));
But if I were looking for an image with the ID of 1, that line would give me a directory starting with 10_article.Name, so obviously that wouldn't do.
I then tried:
string dir2 = Directory.GetDirectories(path).First(d => Path.GetFileName(d).Where(d => d.Split('_')[0] == articleId.ToString()));
I thought that would work, but I'm getting an error on d.Split('_')[0]. It claims that d is now a char, and therefore the method Split can't be used. I understand that you can somehow save images as binary data in a SQL database, and keeping images in the project folder might be dumb, as is my naming convention, but I'm curious how you could solve this. Any insight?
As I understand it, your structure is "/webroot/Images/Articles" (which remains constant), combined with /{articleId}+"_"+{articleName}/{imageName}, where the items in curly braces are variable, correct? How about something like this?
string imageRootFolder = Path.Combine(Enviroment.WebRootPath, "Images", "Articles");
// GetDirectories returns full paths, so compare the folder name only,
// and include the underscore so id 1 does not also match "10_...".
string articleDir = Directory.GetDirectories(imageRootFolder)
    .First(dir => Path.GetFileName(dir).StartsWith(articleId + "_"));
// articleDir is already a full path, so no further Path.Combine is needed.
List<string> pathsToImages = Directory.GetFiles(articleDir).ToList();
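Putting that together, a defensive version might look like the sketch below; the method name and the empty-array fallback are my own, not from the question:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: find all image paths for an article, given the
// layout wwwroot/Images/Articles/{id}_{name}/{image}. Matching on
// "{id}_" (with the underscore) avoids id 1 matching "10_...".
string[] GetArticleImages(string articlesRoot, int articleId)
{
    string prefix = articleId + "_";
    string dir = Directory.GetDirectories(articlesRoot)
        .FirstOrDefault(d => Path.GetFileName(d).StartsWith(prefix));
    // Fall back to an empty array when no folder matches the id.
    return dir == null ? Array.Empty<string>() : Directory.GetFiles(dir);
}
```

FirstOrDefault avoids an exception when an article has no image folder yet, which First would throw.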

How do I include Images in my Database in Visual Studio 2010

I'm fairly new to C#/.NET. We're being taught it at university and are using Visual Studio to create Windows Forms applications. As a new part of the subject we're using databases, tables and datasets.
I opened a new Windows Forms project and immediately added a new database to it. The table I want to create will have two columns - ImageID and the image itself. How do I add the image into the box? I've tried the full path, the relative path, and dragging the image in, but whatever I do I get the same error message:
Invalid Value
The changed value in this cell was not recognized as being valid.
.Net Framework Data Type: Byte[]
Error Message: You cannot use the result pane to set this Field data to
values other than NULL
Type a value appropriate for the data type or press ESC to cancel the
change
How can I have images in there? I just don't know how to use the image data type within the table. Any help is much appreciated.
A simpler approach is to store the image in the file system and only its path in the database. Basically you define a base folder:
string baseFolder = @"c:\Program Files\MyApp\Images";
And use it to store relative paths in the database:
INSERT INTO ImagesTable (Name, Path)
Values ('German Shepherd', 'Dogs\german-shepherd.jpg')
Then, when you need to retrieve the image, you can do it like this:
string path = Path.Combine(baseFolder, @"Dogs\german-shepherd.jpg");
Image img = Image.FromFile(path);
In the following SO question you can find more information about the pros and cons of this approach:
Should I store my images in the database or folders?
You can store images in SQL Server 2008: just create a database table with a column of data type image (varbinary(max) is preferred on current versions).
Then, from the .NET code, use the file upload control to select the image file, and convert the image into a byte[] before inserting the image data into the database.
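As a sketch of that byte[] route - the table and column names are invented, and the provider-neutral ADO.NET base types are used so the same code works with SqlConnection or any other provider:

```csharp
using System;
using System.Data.Common;
using System.IO;

// Hypothetical example: read an image file into a byte[] and insert it
// with a parameterized command, so the binary data is passed safely
// rather than spliced into the SQL text.
void InsertImage(DbConnection connection, int imageId, string imagePath)
{
    byte[] imageBytes = File.ReadAllBytes(imagePath);
    using (DbCommand command = connection.CreateCommand())
    {
        command.CommandText =
            "INSERT INTO ImagesTable (ImageID, ImageData) VALUES (@id, @data)";
        DbParameter id = command.CreateParameter();
        id.ParameterName = "@id";
        id.Value = imageId;
        command.Parameters.Add(id);
        DbParameter data = command.CreateParameter();
        data.ParameterName = "@data";
        data.Value = imageBytes;
        command.Parameters.Add(data);
        connection.Open();
        command.ExecuteNonQuery();
    }
}
```

Reading the image back is then the reverse: select the byte[] and pass it to Image.FromStream over a MemoryStream.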

SSIS - How do I load data from text files where the path of files is inside another text file?

I have a text file that contains a list of files to load into database.
The list contains two columns:
FilePath,Type
c:\f1.txt,A
c:\f2.txt,B
c:\f3.txt,B
I want to provide this file as the source to SSIS and have it go through the list line by line. For each line, it should read the file at the path in the FilePath column and check the Type.
If the type is A, I want it to ignore the first 4 lines of the file located at the FilePath of the current line, and then load the rest of the data inside that file into a table.
If the type is B, I want it to open the file and, for all of its lines, copy the first column into table 1 and the second column into table 2.
I would really appreciate if someone can please provide me a high level list of steps I need to follow.
Any help is appreciated.
Here is one way of doing it within SSIS. Below steps are with respect to SSIS 2008 R2.
Create an SSIS package and create three package variables namely FileName, FilesToRead and Type. FilesToRead variable will hold the list of files and their types information. We will have a loop that will go through each of those records and store the information in FileName and Type variables every time it loops through.
On the control flow tab, place a Data flow task followed by a ForEach Loop container. The data flow task would read the file containing the list of files that has to be processed. The loop would then go through each file. Your control flow tab would finally look something like this. For now, there will be errors because nothing is configured. We will get to that shortly.
On the connection manager section, you need four connections.
First, you need an OLE DB connection to connect to the database. Name this as SQLServer.
Second, a flat file connection manager to read the file that contains the list of files and types. This flat file connection manager will have two columns configured, namely FileName and Type. Name this as Files.
Third, another flat file connection manager to read all files of type A. Name this as Type_A. In this flat file connection manager, enter the value 4 in the text box Header rows to skip so that the first four rows are always skipped.
Fourth, one more flat file connection manager to read all files of type B. Name this as Type_B.
Let's get back to the control flow. Double-click on the first data flow task. Inside it, place a Flat File Source that reads the files using the connection manager Files, followed by a Recordset Destination. Configure the variable FilesToRead in the Recordset Destination. Your first data flow task would look as shown below.
Now, let's go back to the control flow tab. Configure the ForEach Loop as shown below. This loop will go through the recordset stored in the variable FilesToRead. Since the recordset contains two columns, each time a record is looped through, the variables FileName and Type will be assigned the values of the current record.
Inside the ForEach Loop container there are two data flow tasks, namely Type A files and Type B files. You can configure each of these data flow tasks according to your requirements to read the files from the connection managers. However, we need to disable each task based on the type of file that is being read:
Type A files data flow task should be enabled only when A type files are being processed.
Similarly, Type B files data flow task should be enabled only when B type files are being processed.
To achieve this, click on the Type A files data flow task and press F4 to bring the properties. Click on the Ellipsis button available on the Expression property.
On the Property Expressions Editor, select the Disable property and enter the expression !(@[User::Type] == "A")
Similarly, click on the Type B files data flow task and press F4 to bring the properties. Click on the Ellipsis button available on the Expression property.
On the Property Expressions Editor, select the Disable property and enter the expression !(@[User::Type] == "B")
Here is a sample Files.txt containing only A type files in the list. When the package is executed to read this file, you will notice that only the Type A files data flow task executes.
Here is another sample Files.txt containing only B type files in the list. When the package is executed to read this file, you will notice that only the Type B files data flow task executes.
If Files.txt contains both A and B type files, the loop will execute the appropriate data flow task based on the type of file that is being processed.
Configuring Data Flow task Type A files
Let's assume that your flat files of type A have a three-column layout like the one shown below, with comma-separated values. The file data here is shown using Notepad++ with all special characters visible; CR LF denotes that the lines end with a carriage return and line feed. This file is stored in the path C:\f1.txt.
We need a table in the database to import the data. Let's create a table named dbo.Table_A in the SQL Server database as shown here.
Now, go to the SSIS package. Here are the details to configure the flat file connection manager named Type_A. Give a name to the connection manager. You need to specify the value 4 in the Header rows to skip textbox. Your flat file connection manager should look something like this.
On the Advanced tab, you can rename the column names if you would like to.
Now that the connection manager is configured, we need to configure data flow task Type A files to process the corresponding files. Double-click on the data flow task Type A files. Place a Flat file source and OLE DB Destination inside the task.
The flat file source has to be configured to read the files from flat file connection manager.
The data flow task doesn't do anything special. It simply reads the flat files of type A and inserts the data into the table dbo.Table_A. Now, we need to configure the OLE DB Destination to insert the data into the database. The column names configured in the flat file connection manager and in the table are not the same, so they have to be mapped manually.
Now that the data flow task is configured, we have to make sure that the file path read from Files.txt is passed correctly. To do this, click on the Type_A flat file connection manager and press F4 to bring up the properties. Set the DelayValidation property to True, then click the Ellipsis button on the Expressions property.
On the Property Expression builder, select the ConnectionString property and set it to the expression @[User::FileName]
Here is a sample Files.txt file containing Type A files only.
Here are the sample type A files f01.txt and f02.txt
After the package execution, following data will be found in the table Table_A
The above configuration steps also have to be followed for Type B files. However, the data flow task will look slightly different, since the file processing logic is different. The data flow task Type B files would look something like this. Since you have to insert the two columns in type B files into different tables, you have to use a Multicast transformation, which creates clones of the input data; you can then pass each multicast output to a different transformation or destination.
Hope that helps you to achieve your task.
I would recommend that you create an SSIS package for each different type of file load you're going to do. You can execute those packages from another program; see here: How to execute an SSIS package from .NET?
Given this information, you can write a quick program to execute the relevant packages:
var jobs = File.ReadLines("C:\\temp\\order.txt")
.Skip(1)
.Select(line => line.Split(','))
.Select(tokens => new { File = tokens[0], Category = tokens[1] });
foreach (var job in jobs)
{
// execute the relevant package for job.Category using job.File
}
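To make the dispatch concrete, here is one way to turn each parsed control-file line into a (file, package) job. The package paths below are hypothetical - substitute the .dtsx files you actually deploy for each file type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mapping from the Type column to the SSIS package that
// knows how to load that kind of file.
var packagesByCategory = new Dictionary<string, string>
{
    ["A"] = @"C:\packages\LoadTypeA.dtsx",
    ["B"] = @"C:\packages\LoadTypeB.dtsx",
};

// Parse "FilePath,Type" lines (skipping the header) into jobs pairing
// each source file with the package that should process it.
IEnumerable<(string File, string Package)> PlanJobs(IEnumerable<string> lines) =>
    lines.Skip(1)
         .Select(line => line.Split(','))
         .Select(t => (File: t[0], Package: packagesByCategory[t[1]]));
```

Each job can then be handed to whatever package-execution mechanism you choose, such as the approach in the linked question.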
My solution would look like N + 1 flat file Connection Managers to handle the source files. CM A would address the skip first 4 rows file format, B sounds like it's just a 2 column file, etc. The last CM would be used to parse the command file you've illustrated.
Now that you have all of those Connection Managers defined, you can go about the processing logic.
Create 3 variables: 2 of type String (CurrentPath, CurrentType), and 1 of type Object, which I called Recordset.
The first Data Flow reads all the rows from the flat file source using "CM Control" and stores them in the Recordset variable. This is the data you supplied in your example.
We will then use that Recordset object as the source for a ForEach Loop Container in what is commonly referred to as shredding. Bingle the term "Shred recordset ssis" and you're bound to hit a number of articles describing how to do it. The net result is that for each row in that source CM Control file, you will assign those values into the CurrentPath, CurrentType variables.
Inside that loop container, create a central point for control to radiate out from. I find a Script Task works wonderfully for this: drag it onto the canvas, give it a descriptive name to indicate that it isn't used for anything itself, and then create a data flow to handle each processing permutation.
The magic comes from using expressions. Dang near everything in SSIS can have expressions set on its properties, which is what separates the professionals from the poseurs. Here, we will double-click on the line connecting to a given data flow and change the constraint type from "Constraint" to "Expression and Constraint". The expression you would then use is something like @[User::CurrentType] == "A". This ensures the path is only taken when both the parent task succeeded and the condition is true.
The second bit of expression magic is applied to the connection managers themselves. They will need their ConnectionString property driven by the value of the @[User::CurrentPath] variable. This allows a design-time value of C:\filea.txt but a runtime value, taken from the control file, of \\network\share\ClientFileA.txt. Unless all the files have the same structure, you'll most likely need to set DelayValidation to True in the properties; otherwise, SSIS will fail pre-validation, since all of the connection managers "CM A" through "CM N" would be using that CurrentPath variable, which may or may not be a valid connection string for each file layout.

How to validate column before importing into database

I am a complete newbie to SSIS.
I have a c#/sql server background.
I would like to know whether it is possible to validate data before it goes into a database. I am grabbing text from a |(pipe) delimited text file.
For example, if a certain data point is null, change it to 0, or if a certain data point's length is 0, change it to "nada".
I don't know if this is even possible with SSIS, but it would be most helpful if you can point me into the right direction.
Anything is possible with SSIS!
After your flat file source, use a Derived Column transformation and derive a new column with an expression like the following, which covers both of the rules from the question (null becomes "0", empty becomes "nada"):
ISNULL(ColumnName) ? "0" : LEN(ColumnName) == 0 ? "nada" : ColumnName
Then use this derived column in your data flow destination.
Hope it helps.
I don't know if you're dead set on using SSIS, but the basic method I've generally used to import textfile data into a database generally takes two stages:
Use BULK INSERT to load the file into a temporary staging table on the database server; each of the columns in this staging table are something reasonably tolerant of the data they contain, like a varchar(max).
Write up validation routines to update the data in the temporary table and double-check to make sure that it's well-formed according to your needs, then convert the columns into their final formats and push the rows into the destination table.
I like this method mostly because BULK INSERT can be a bit cryptic about the errors it spits out; with a temporary staging table, it's a lot easier to look through your dataset and fix errors on the fly as opposed to rooting through a text file.
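For completeness, the two stages described above might look like this in T-SQL; the table names, file path, column types, and validation rules are invented for the example:

```sql
-- Stage 1: a staging table where everything lands as tolerant text.
CREATE TABLE dbo.StagingOrders (
    Col1 varchar(max) NULL,
    Col2 varchar(max) NULL
);

-- Load the pipe-delimited file as-is.
BULK INSERT dbo.StagingOrders
FROM 'C:\data\orders.txt'
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', FIRSTROW = 1);

-- Stage 2: validation pass, applying the rules from the question.
UPDATE dbo.StagingOrders SET Col1 = '0'    WHERE Col1 IS NULL;
UPDATE dbo.StagingOrders SET Col2 = 'nada' WHERE LEN(Col2) = 0;

-- Finally, convert to real types and push into the destination table.
INSERT INTO dbo.Orders (Amount, Description)
SELECT CAST(Col1 AS int), Col2
FROM dbo.StagingOrders;
```

Rows that still fail the CAST at the final step are easy to find with a SELECT against the staging table before the move.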
