need help with functions

S

sheela

I have some data recorded for every day for some subjects.
And the possible recorded value on each day is: 0 or 1 or 2 or 3 or 4 or null.
I have this type of data for number of subjects over many months.
My data looks like below.

Subjectname year month day1 day2…………..day30 day31

Name 1990 January 2 0 4 4
Name 1990 February 3 1 null null
Name 1990 March 2 null 3 1
So on.

I need to calculate the following 3 for each row.
1) On which day of the month first occurrence of the 4 comes?

This one I figured out using match() function which I paste below.
=MATCH(4,N2:AR2,0).

2) On which day of the month last occurrence of the 4 comes?
My data can not be sorted, and I couldn’t use match function here.

3) If there are any 4’s in a month, how many consecutive 4’s are there?


Thank you in advance for any help,
sheela.
 
D

Dave Peterson

How about just a reply for #2.

This will return the column number that has the last 4:

=LOOKUP(2,1/(N2:AR2=4),COLUMN(N2:AR2))

so
=LOOKUP(2,1/(N2:AR2=4),COLUMN(N2:AR2))-column(N2)+1
will return the day number
 
R

RagDyer

Try this in AS2:

=INDEX($N$1:$AR$1,MATCH(4,$N2:$AR2,0))

And this in AT2:

=LOOKUP(2,1/($N2:$AR2=4),$N$1:$AR$1)

And copy down as needed.

2 out of 3 !
--
HTH,

RD
==============================================
Please keep all correspondence within the Group, so all may benefit!
==============================================
 
S

sheela

Hello Dave Peterson:

The given formula worked just perfect. But I couldn’t understand how this
function is working.
Why the look up value is is 2? And the what does look up vector 1/(N2:AR2=4)
indicate?
And finally how is it working on unsorted data.
I am sorry I thought I could understand, but seems it is tricky. Thank you
very much for your help.
 
S

sheela

I am sorry; I am still not able to figure out the 3rd question my post.
Could some one please give a hint on this one please?
 
D

Dave Peterson

1/(n2:a42=4)
returns an array of
1's and errors.

lookup(2, that array) will search for the first 2 in that array. Since there is
no 2, it'll find that last 1 and use that. And the last 1 in that array
corresponds to the last cell in n2:Ar2 that equals 4.

Good luck on #3.
 
D

Dave Peterson

1/(n2:a42=4)
should have been
1/(n2:aR2=4)

Dave said:
1/(n2:a42=4)
returns an array of
1's and errors.

lookup(2, that array) will search for the first 2 in that array. Since there is
no 2, it'll find that last 1 and use that. And the last 1 in that array
corresponds to the last cell in n2:Ar2 that equals 4.

Good luck on #3.
 
R

Ron Rosenfeld

I am sorry; I am still not able to figure out the 3rd question my post.
Could some one please give a hint on this one please?

Well, you could always use a User Defined Function (UDF):

<alt-F11> opens the VB Editor.

Ensure your project is highlighted in the project explorer window, then:

Insert/Module

and paste the code below into the window that opens.

To use this function, enter something like:

=Consec(range, number)

where range is the range to be searched, and number the number you wish to test
for maximum consecutives.

e.g.

=Consec(N2:AR2,4)


===============================
Option Explicit

Function Consec(rg As Range, num As Long) As Long
Dim c As Range
Dim t1 As Long, t2 As Long

For Each c In rg
If c.Value = num Then
t1 = t1 + 1
Else
t2 = Application.WorksheetFunction.Max(t2, t1)
t1 = 0
End If
Next c

Consec = Application.WorksheetFunction.Max(t1, t2)

End Function
================================


--ron
 
S

sheela

Hello Ron Rosenfeld:

Thank you very much for your help. It worked out very well.
Now I had to modify some rules to count the consecutive 4’s.
1) if there is just only a single 4 ( no other consecutive 4’s ) any where
in the row we ignore that 4 and treat it as “ non 4â€.
2) If there are less than 10 numbers of “non 4†in between 4 s, we count
them also as consecutive 4.
3) if these consecutive 4’s are ending in the row ( exists on day 30, day
31), then continue counting onto the next row for the consecutive 4’s.


If the condition 3 is not possible could you please help me with the first
2. If the implementation of condition 3 also possible that would be really
great.

I would greatly appreciate any help.

Part of data:
Day1,Day2,Day3,Day28,Day29,Day30,Day31,consec
1,4,3,2,,0,1,0
1,0,0,2,4,4,4,5
4,4,2,0,3,1,0,0

The consec() function values in the last column, which was calculated based
on the given 7 columns.


sheela.
 
R

Ron Rosenfeld

Hello Ron Rosenfeld:

Thank you very much for your help. It worked out very well.
Now I had to modify some rules to count the consecutive 4’s.
1) if there is just only a single 4 ( no other consecutive 4’s ) any where
in the row we ignore that 4 and treat it as “ non 4”.

That's easy. Just add something like: if t2 = 1 then t2 = 0

2) If there are less than 10 numbers of “non 4” in between 4 s, we count
them also as consecutive 4.

That's also relatively easy. Just use another counter.

3) if these consecutive 4’s are ending in the row ( exists on day 30, day
31), then continue counting onto the next row for the consecutive 4’s.

Does this interact with rule 2? Do the last 2 days have to be a 4 in order to
go on to the next row? Or if, for example, Day 25 is a 4 and Day 2 is a 4, and
the initial row is a 31 day month, do we count 9 consecutive 4's?

If the condition 3 is not possible could you please help me with the first
2. If the implementation of condition 3 also possible that would be really
great.

I would greatly appreciate any help.

Part of data:
Day1,Day2,Day3,Day28,Day29,Day30,Day31,consec
1,4,3,2,,0,1,0
1,0,0,2,4,4,4,5
4,4,2,0,3,1,0,0

The consec() function values in the last column, which was calculated based
on the given 7 columns.


sheela.

--ron
 
S

sheela

Yes, you are right. I wasn’t clear.
The rule 2 will continue on from row to the following row.
Now thinking about I just realized, we have to treat all 12 rows (or less in
some cases) in a year as one series of values.
In other words we calculate all the consecutive occurrences of 4 in a year.
The data has 2 other columns with year and month information. I have this
data for many subjects and for many years. It is very messy data.
I hope I stated this clear. I will try to implement the counters for rules 1
and 2 again. I tried yester day, but not working.

Thank you again for your help.
sheela
 
R

Ron Rosenfeld

Yes, you are right. I wasn’t clear.
The rule 2 will continue on from row to the following row.
Now thinking about I just realized, we have to treat all 12 rows (or less in
some cases) in a year as one series of values.
In other words we calculate all the consecutive occurrences of 4 in a year.
The data has 2 other columns with year and month information. I have this
data for many subjects and for many years. It is very messy data.
I hope I stated this clear. I will try to implement the counters for rules 1
and 2 again. I tried yester day, but not working.

Thank you again for your help.
sheela

Treating the entire year as one will actually make things simpler, at least for
me.

It can be implemented in VBA without an add-in, but if you can download and
install Longre's free morefunc.xll add-in, we can do it with worksheet
formulas.

The add-in is available at http://xcell05.free.fr/forums

We can then use the MCONCAT function to create a sequential array of your
values; SUBSTITUTE to change the "nulls" to -'s; and a regular expression to
extract those strings that follow what I think are your rules.

I have made the assumption that a string of ...44121212444... will count as a
string length of 11 4's and that the count will restart after the end of the
"tripled" 4.

I don't know how to count a string that looks like 4411441144 ???

Is it a 6, (441144) then a 2 (44 at the end of the line); or is it a single
match of 10 ??

The following assumes it would count as a 6, then a 2.

In any event, the following **array** entered formula will give you the maximum
run of 4's according to all the above assumptions:

1. When counting a sequence of non-4's that are between two 4's, include this
number in the total count.

2. Only one intervening string of non-fours is allowed.

3. Nulls do NOT count as either fours or as non-fours.

=MAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT(
$D$2:$AH$14),"null","-"),"4+([0-35-9]{1,9})?4+",ROW(
INDIRECT("1:365")))))

To enter an array-formula, hold down <ctrl><shift> while hitting <enter>.

Let me know what you think.
--ron
 
S

sheela

Hello ron:

Thank you very much for the add in and very nice of you for your help.

About my query: we treat “null†value as a non 4.
If there is 21441144114423, we count this as of length 10. Can allow
intervening string of non 4's in either side, but count any 4 just once
(should not include in two concurrent series).
When a count of less than 10 non 4’s occurred between some 4’s we include
those non 4’s also in the count. If the non 4’s count is more than 10, they
will not be included.
For example: the series 2310444432023112442211231232123123 is of length 14.
Another main thing is I need to count the lengths of all concurrent 4
series, not just the maximum. For this the addin function mmax mmin can be
used, with limiting number of maximums upto 24.

Thanks again.


sheela.


Ron Rosenfeld said:
Yes, you are right. I wasn’t clear.
The rule 2 will continue on from row to the following row.
Now thinking about I just realized, we have to treat all 12 rows (or less in
some cases) in a year as one series of values.
In other words we calculate all the consecutive occurrences of 4 in a year.
The data has 2 other columns with year and month information. I have this
data for many subjects and for many years. It is very messy data.
I hope I stated this clear. I will try to implement the counters for rules 1
and 2 again. I tried yester day, but not working.

Thank you again for your help.
sheela

Treating the entire year as one will actually make things simpler, at least for
me.

It can be implemented in VBA without an add-in, but if you can download and
install Longre's free morefunc.xll add-in, we can do it with worksheet
formulas.

The add-in is available at http://xcell05.free.fr/forums

We can then use the MCONCAT function to create a sequential array of your
values; SUBSTITUTE to change the "nulls" to -'s; and a regular expression to
extract those strings that follow what I think are your rules.

I have made the assumption that a string of ...44121212444... will count as a
string length of 11 4's and that the count will restart after the end of the
"tripled" 4.

I don't know how to count a string that looks like 4411441144 ???

Is it a 6, (441144) then a 2 (44 at the end of the line); or is it a single
match of 10 ??

The following assumes it would count as a 6, then a 2.

In any event, the following **array** entered formula will give you the maximum
run of 4's according to all the above assumptions:

1. When counting a sequence of non-4's that are between two 4's, include this
number in the total count.

2. Only one intervening string of non-fours is allowed.

3. Nulls do NOT count as either fours or as non-fours.

=MAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT(
$D$2:$AH$14),"null","-"),"4+([0-35-9]{1,9})?4+",ROW(
INDIRECT("1:365")))))

To enter an array-formula, hold down <ctrl><shift> while hitting <enter>.

Let me know what you think.
--ron
 
R

Ron Rosenfeld

Hello ron:

Thank you very much for the add in and very nice of you for your help.

About my query: we treat “null” value as a non 4.
If there is 21441144114423, we count this as of length 10. Can allow
intervening string of non 4's in either side, but count any 4 just once
(should not include in two concurrent series).
When a count of less than 10 non 4’s occurred between some 4’s we include
those non 4’s also in the count. If the non 4’s count is more than 10, they
will not be included.
For example: the series 2310444432023112442211231232123123 is of length 14.
Another main thing is I need to count the lengths of all concurrent 4
series, not just the maximum. For this the addin function mmax mmin can be
used, with limiting number of maximums upto 24.

Thanks again.


sheela.

OK. let's see if I understand you:

The function:

=REGEX.MID(SUBSTITUTE(MCONCAT($D$2:$AH$14),
"null","0"),"(4+)([^4]{0,9}4+)+",ROWS($1:1))

1. Concatenates the results for the entire year -- 12 rows
2. Substitutes a "0" for "null" to make everything numeric and easier.
3. As written, returns the first string that meets your criteria. If you
copy/drag the formula down, the final term ..ROWS($1:1) will adjust to return
the 2nd, 3rd, etc. term that meets the criteria. This will be useful for
troubleshooting.
4. In an adjacent column you can use the LEN(cell_ref) formula to check the
length of the string.

If the above is giving you the results you want, then the next step would be to
return an array of qualified string lengths, and pull out the 24 highest.

This **array** formula should do that:

=INDEX(MMAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT($D$2:$AH$4),
"null","0"),"(4+)([^4]{0,9}4+)+",ROW(INDIRECT("1:255")))),24),ROWS($1:1))

Enter it in some cell as an array-formula (with <ctrl><shift><enter>) and
copy/drag down 23 cells. It should output the length of each string of 2 or
more fours.

Hope this is getting close to what you need.


--ron
 
R

Ron Rosenfeld

=INDEX(MMAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT($D$2:$AH$4),
"null","0"),"(4+)([^4]{0,9}4+)+",ROW(INDIRECT("1:255")))),24),ROWS($1:1))

One typo and one explanation.

=INDEX(MMAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT($D$2:$AH$14),
"null","0"),"(4+)([^4]{0,9}4+)+",ROW(INDIRECT("1:255")))),24),ROWS($1:1))

I had entered only to Row 4 in the range to concatenate.

There is no need for the second factor in the ROW(INDIRECT(... to be 255 as the
maximum number of 44's in a year can only be, I believe, 366/3 or 122.

I don't know if lowering that number will make any perceptible difference in
speed; it does have to be at least as large as the maximum number of valid
subsequences in your main sequence. I guess it could be as low as 122 for one
year; but should be higher if you analyze a multiyear string.


--ron
 
S

sheela

Ron:

Thank you very much for your help. The mmax function is not working in my
system, it is working like a regular max function, returning just the
maximum. Though I am not getting exactly what I needed, your functions are
very helpful, and very much appreciated.

Sheela.



Ron Rosenfeld said:
=INDEX(MMAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT($D$2:$AH$4),
"null","0"),"(4+)([^4]{0,9}4+)+",ROW(INDIRECT("1:255")))),24),ROWS($1:1))

One typo and one explanation.

=INDEX(MMAX(LEN(REGEX.MID(SUBSTITUTE(MCONCAT($D$2:$AH$14),
"null","0"),"(4+)([^4]{0,9}4+)+",ROW(INDIRECT("1:255")))),24),ROWS($1:1))

I had entered only to Row 4 in the range to concatenate.

There is no need for the second factor in the ROW(INDIRECT(... to be 255 as the
maximum number of 44's in a year can only be, I believe, 366/3 or 122.

I don't know if lowering that number will make any perceptible difference in
speed; it does have to be at least as large as the maximum number of valid
subsequences in your main sequence. I guess it could be as low as 122 for one
year; but should be higher if you analyze a multiyear string.


--ron
 
Top