Splitting Data into Multiple Fields - Access

Splitting Data into Multiple Fields - Access

Author
Discussion

Mikeyplum

Original Poster:

1,646 posts

171 months

Wednesday 17th October 2012
quotequote all
My inferior brain has almost entirely melted through my ears. banghead

I have a Field in Access called FULL NAME, which consists of complete names:

e.g. MR&MRS JOHN J & JANE J DOUGH

I have already managed to split the Title and the Last Name into seperate fields using:

Title: Left([Debt By Rating Area]![c_FullName],InStr(1,[Debt By Rating Area]![c_FullName]," ")-1)

Last Name: Right([Debt By Rating Area]![c_FullName],Len([Debt By Rating Area]![c_FullName])-InStrRev([Debt By Rating Area]![c_FullName]," "))

Does anyone know how I can get everything in the middle into a seperate field (i,e the JOHN J & JANE J) bit.

FYI - not all the data will be the same. There may be names such as MR J JONES etc.

Cheers!
slowly gathers melted brain from keyboard

Du1point8

21,618 posts

194 months

Wednesday 17th October 2012
quotequote all
if its always a &, etc

http://www.techonthenet.com/access/questions/parse...

parsing a string and looking for a delimiter (& in this case)

is there anytime it would fall out of this assumption?

If there is no & it wouldnt split it.

Mikeyplum

Original Poster:

1,646 posts

171 months

Wednesday 17th October 2012
quotequote all
Du1point8 said:
if its always a &, etc

http://www.techonthenet.com/access/questions/parse...

parsing a string and looking for a delimiter (& in this case)

is there anytime it would fall out of this assumption?

If there is no & it wouldnt split it.
Not always.

It might just be the one name, like MR JOHN DOUGH, or similar variances.

crazy about cars

4,454 posts

171 months

Wednesday 17th October 2012
quotequote all
That would be very hard due to the variation of the "middle" bit but not impossible smile Assuming that your single field has a common pattern which is :

[Title] (space) [First/Middle Name] (space) [Last Name]

Simple explanation would be to get the first position and last position of the middle part. You can do this by counting chars from end of field until first space and repeat by counting from start until first space.

It's a fairly complex process and could be resource intensive. I also wouldn't recommend using the way you have posted to split. I would prefer using a query to achieve your goals smile

edit: there can be spaces in the middle bit as long as there's a space before Last Name and after Title.

Mikeyplum

Original Poster:

1,646 posts

171 months

Wednesday 17th October 2012
quotequote all
crazy about cars said:
That would be very hard due to the variation of the "middle" bit but not impossible smile Assuming that your single field has a common pattern which is :

[Title] (space) [First/Middle Name] (space) [Last Name]

Simple explanation would be to get the first position and last position of the middle part. You can do this by counting chars from end of field until first space and repeat by counting from start until first space.

It's a fairly complex process and could be resource intensive. I also wouldn't recommend using the way you have posted to split. I would prefer using a query to achieve your goals smile
Thanks.

Unfortunatley, not all of the records have the same "patter", if you like.

Some may be - [Title] (space) [First name] (space) (&) (space) [First name] (space) [Last Name)

It's a bit of a PITA to be honest!

Ideally, I would like it split:

Field 1 - Title
Field 2 - First and Middle Name
Field 3 - Last Name

Not an expert on Access, so not sure where you are coming from. The code in the OP is copied and pasted from a Query. I daren't enter into the realms of SQL, not with my limited knowledge! hehe

KaraK

13,207 posts

211 months

Wednesday 17th October 2012
quotequote all
crazy about cars said:
That would be very hard due to the variation of the "middle" bit but not impossible smile Assuming that your single field has a common pattern which is :

[Title] (space) [First/Middle Name] (space) [Last Name]

Simple explanation would be to get the first position and last position of the middle part. You can do this by counting chars from end of field until first space and repeat by counting from start until first space.

It's a fairly complex process and could be resource intensive. I also wouldn't recommend using the way you have posted to split. I would prefer using a query to achieve your goals smile
^this


I had to do a similar one a few years back in ASP - I don't have the code to hand but IIRC what I did was split the string into an array based on the space character then store the first and last elements as the title/lastname and then put all the remaining elements back together as a string (with appropriate spaces) and that was the firstname(s)

Ran like st but as it was a quick and dirty one off I wasn't too fussed.


Something like:

arrNames = split(strName," ")
strTitle = arrNames(0)
strLastname = arrames(Ubound(arrNames))
strFirstname = Join(arrNames," ")


ETA: Doing in in SQL might be faster and I know how I'd do it in T-SQL but doubt it would work in Access frown

crazy about cars

4,454 posts

171 months

Wednesday 17th October 2012
quotequote all
The option I gave does split it into 3 fields smile You can have as many spaces in the middle part but it will work as long as you have a space after title and also a space before starting the last name.

So for example : Mr&Mrs John & Jane Mary Doe would be split to Field 1 : Mr&Mrs Field 2 : John & Jane Mary Field 3 : Doe.

From your code it seems like you are doing this at runtime ie assigning it to some text boxes in a form. Do you just want to view the split or do you want to cleanse some sort of a table to split them? Just need to understand what are you trying to achieve smile

Mikeyplum

Original Poster:

1,646 posts

171 months

Wednesday 17th October 2012
quotequote all
crazy about cars said:
The option I gave does split it into 3 fields smile You can have as many spaces in the middle part but it will work as long as you have a space after title and also a space before starting the last name.

So for example : Mr&Mrs John & Jane Mary Doe would be split to Field 1 : Mr&Mrs Field 2 : John & Jane Mary Field 3 : Doe.

From your code it seems like you are doing this at runtime ie assigning it to some text boxes in a form. Do you just want to view the split or do you want to cleanse some sort of a table to split them? Just need to understand what are you trying to achieve smile
Sorry, just saw your edit note about the spaces.

Sounds perfect.

I have a table in Access (Master Table), which has the field of "Full Name". I have added two more columns "Title" and "Last Name" (as thats as far as I got). I would now like to add another column "First Name(s)" to add the rest. So far, I have ran a Query to split the names. Then created an Update Query, to update the Master Table with the split data.

Does that make sense?

Thanks again!

crazy about cars

4,454 posts

171 months

Wednesday 17th October 2012
quotequote all
Mikeyplum said:
Sorry, just saw your edit note about the spaces.

Sounds perfect.

I have a table in Access (Master Table), which has the field of "Full Name". I have added two more columns "Title" and "Last Name" (as thats as far as I got). I would now like to add another column "First Name(s)" to add the rest. So far, I have ran a Query to split the names. Then created an Update Query, to update the Master Table with the split data.

Does that make sense?

Thanks again!
Yup that does make sense smile You can split the name in your Update Query if you create a function for it. Just stick the function in your "update to" field in the query. I guess this is just a cleansing operation and in the future your front end (user forms) should have separate textboxes to store those names?

I had the same task to sometime ago. The company prefers displaying full names and I find it more efficient to create a query with a custom field which concatenates first and last names for that purpose!

edit : Look into a VBA function called "InStrRev" which should make things easier
edit2 : Looks like you already know about that smile


Edited by crazy about cars on Wednesday 17th October 14:02


Edited by crazy about cars on Wednesday 17th October 14:03

plasticpig

12,932 posts

227 months

Wednesday 17th October 2012
quotequote all
So how does "Captain Leone Sextus Denys Oswolf Fraudatifilius Tollemache-Tollemache de Orellana Plantagenet Tollemache-Tollemache" fit in to your schemes of splitting names up? Without loads of programming you will not get 100% accuracy because people have double-barreled surnames with spaces in them and then you have surnames like de Souza. It is not a trivial task.





Captain Leone Sextus Denys Oswolf Fraudatifilius Tollemache-Tollemache de Orellana Plantagenet Tollemache-Tollemache is a real persons name











Mikeyplum

Original Poster:

1,646 posts

171 months

Wednesday 17th October 2012
quotequote all
crazy about cars said:
Yup that does make sense smile You can split the name in your Update Query if you create a function for it. Just stick the function in your "update to" field in the query. I guess this is just a cleansing operation and in the future your front end (user forms) should have separate textboxes to store those names?

I had the same task to sometime ago. The company prefers displaying full names and I find it more efficient to create a query with a custom field which concatenates first and last names for that purpose!

edit : Look into a VBA function called "InStrRev" which should make things easier
edit2 : Looks like you already know about that smile


Edited by crazy about cars on Wednesday 17th October 14:02


Edited by crazy about cars on Wednesday 17th October 14:03
This is where I'm struggling. I have used the InStr and InStrRev to find out title and last name, I don't have the knowledge (I don't think) to find the bit inbetween.

plasticpig said:
So how does "Captain Leone Sextus Denys Oswolf Fraudatifilius Tollemache-Tollemache de Orellana Plantagenet Tollemache-Tollemache" fit in to your schemes of splitting names up? Without loads of programming you will not get 100% accuracy because people have double-barreled surnames with spaces in them and then you have surnames like de Souza. It is not a trivial task.





Captain Leone Sextus Denys Oswolf Fraudatifilius Tollemache-Tollemache de Orellana Plantagenet Tollemache-Tollemache is a real persons name
hehe

I appreciate it's a real name, however out of the 25,000 records I have, I think he is unlikely to pop up!

I see what your'e getting at though. But by finding the first "word" and the last "word", I'm hoping I could find out whatever the middle bit is. Even if that is our gentlemanly Captain Leone Sextus Denys Oswolf Fraudatifilius Tollemache-Tollemache de Orellana Plantagenet Tollemache-Tollemache is a real persons name

Edited by Mikeyplum on Wednesday 17th October 16:26


Edited by Mikeyplum on Wednesday 17th October 16:48

Du1point8

21,618 posts

194 months

Wednesday 17th October 2012
quotequote all
If you are good at access I can show you how its done in C# and you can apply similar logic in Access??? Im access is very very rusty.

Due the way that the information appears in yours its very difficult to code in Access, although it looks like you need to use split function.

For example this below:

Const TestString As String = "MR&MRS JOHN & JANE DOUGH"

Dim strText As String
Dim strParts() As String
Dim intCounter As Integer

strText = Replace(strText, " & ", "&")

strParts = Split(strText, "&")

For intCounter = LBound(strParts()) To UBound(strParts())
MsgBox strParts(intCounter)
Next intCounter

Falls over on the names so would only work if it was MR JOHN DOUGH (using " " instead of "&").

So if you have a little more knowledge you could use something similar to the above for it?

mrmr96

13,736 posts

206 months

Wednesday 17th October 2012
quotequote all
Mikeyplum said:
My inferior brain has almost entirely melted through my ears. banghead

I have a Field in Access called FULL NAME, which consists of complete names:

e.g. MR&MRS JOHN J & JANE J DOUGH

I have already managed to split the Title and the Last Name into seperate fields using:

Title: Left([Debt By Rating Area]![c_FullName],InStr(1,[Debt By Rating Area]![c_FullName]," ")-1)

Last Name: Right([Debt By Rating Area]![c_FullName],Len([Debt By Rating Area]![c_FullName])-InStrRev([Debt By Rating Area]![c_FullName]," "))

Does anyone know how I can get everything in the middle into a seperate field (i,e the JOHN J & JANE J) bit.

FYI - not all the data will be the same. There may be names such as MR J JONES etc.

Cheers!
slowly gathers melted brain from keyboard
I can do this in excel, so assume the logic is the same.

You basically need to use a MID function, which takes 3 paramters:
=MID(String,Start,Length)
String = the original string
Start = the char number to start at
Length = the number of chars to return

So to get the "middle bit" all you need to know is where to start, and how many chars to return.
Finding the starting point is dead easy, you use INSTR to get the position of the first " ".
Finding the number of chars to return is only ever so slightly harder. You use INSTRREV (or whatever that's called) to find the number of chars from the end where the LAST " " occurs. You then take the lenght of the entire string and subtract the position of the first " " and subtract the number of chars before the end that the last " " is and then that gives you the lenght of the middle bit.

For example, if the string was "MR&MRS JOHN J & JANE J DOUGH" then you'd do this:


MR&MRS JOHN J & JANE J DOUGH
.........1111111111222222222
1234567890123456789012345678

Start point is 7+1 = 8
Lenght = Total Lenght - Position of first space - Position from end of last space
...... = 28 - 7 - 6 = 15
So you'd start at position 8, and the following 15 chars.


Liszt

4,330 posts

272 months

Wednesday 17th October 2012
quotequote all
In my view you need to establish some rules first.

Run a query to identify all the entries which are not "title firstname lastname" do a count of all the spaces and identify which are more than 2.

Then you can look at all those that contain & to work out how they are different.

May be worth doing a global replace of " and " with "&".

Once you have established the rules then you can sort it programatically.

Clean your data first and you will have less rules.

plasticpig

12,932 posts

227 months

Wednesday 17th October 2012
quotequote all
The proper way of cleaning the data is to obtain the Surname and Forename data from the Office of National Statistics (it's part of their census data). Then you can identify all the surnames by matching them against the surname data. You can do the same thing with forename's as well. This isn't available online and is probably considered to be of commercial value so probably isn't available via a FOIA request. There are some smaller free data sets about but the census data is the most reliable.


















Mikeyplum

Original Poster:

1,646 posts

171 months

Thursday 18th October 2012
quotequote all
mrmr96 said:
I can do this in excel, so assume the logic is the same.

You basically need to use a MID function, which takes 3 paramters:
=MID(String,Start,Length)
String = the original string
Start = the char number to start at
Length = the number of chars to return

So to get the "middle bit" all you need to know is where to start, and how many chars to return.
Finding the starting point is dead easy, you use INSTR to get the position of the first " ".
Finding the number of chars to return is only ever so slightly harder. You use INSTRREV (or whatever that's called) to find the number of chars from the end where the LAST " " occurs. You then take the lenght of the entire string and subtract the position of the first " " and subtract the number of chars before the end that the last " " is and then that gives you the lenght of the middle bit.

For example, if the string was "MR&MRS JOHN J & JANE J DOUGH" then you'd do this:


MR&MRS JOHN J & JANE J DOUGH
.........1111111111222222222
1234567890123456789012345678

Start point is 7+1 = 8
Lenght = Total Lenght - Position of first space - Position from end of last space
...... = 28 - 7 - 6 = 15
So you'd start at position 8, and the following 15 chars.
Cheers mr...

Any idea how that would look in an access query?

My first thoughts were:

Mid([Debt By Rating Area]![c_FullName],InStr([Debt By Rating Area]![c_FullName]," "),Len([Debt By Rating Area]![c_FullName])-InStrRev([Debt By Rating Area]![c_FullName]," "))

It started in the right place, but cut the name short...

Mikeyplum

Original Poster:

1,646 posts

171 months

Thursday 18th October 2012
quotequote all
Cracked it! woohoosilly

The Query was:

Mid([Debt By Rating Area]![c_FullName],InStr(1,[Debt By Rating Area]![c_FullName]," "),Len([Debt By Rating Area]![c_FullName])-InStr([Debt By Rating Area]![c_FullName]," ")+1-(Len([Debt By Rating Area]![c_FullName])-InStrRev([Debt By Rating Area]![c_FullName]," ")+1))

Which is pretty much what mrmr2 said:

Total Length - Position of first space - (Total Length - Position of last space)

Now I can go home and drink myself into a stuper to regain some brain cells! biggrin

Thanks for all your help guys! thumbup

mrmr96

13,736 posts

206 months

Thursday 18th October 2012
quotequote all
Well done. Told you it was easy! wink

mrmr96

13,736 posts

206 months

Thursday 18th October 2012
quotequote all
Mikeyplum said:
Which is pretty much what mrmr2 said:

Total Length - Position of first space - (Total Length - Position of last space)
You can also simplify this as
Total Length - Position of first space - (Total Length - Position of last space) = Position of last space - Position of first space

wink

Mikeyplum

Original Poster:

1,646 posts

171 months

Friday 19th October 2012
quotequote all
mrmr96 said:
You can also simplify this as
Total Length - Position of first space - (Total Length - Position of last space) = Position of last space - Position of first space

wink
Thats what I tried doing, as I though InStrRev gave the position of the last space from the right. But it doesn't, it gived the position of the last space from the left. Hence the added sum at the end.

e.g.

MR DAVID JONES

InStr woulf result in 3
InStrRev would result in 9

PITFA!! hehe