Thank you for being a valued part of the CNET community. As of December 1, 2020, the forums are in read-only format. In early 2021, CNET Forums will no longer be available. We are grateful for the participation and advice you have provided to one another over the years.

Thanks,

CNET Support

General discussion

Excel Arrays

Nov 24, 2004 9:25AM PST

I am having trouble with a large (6000 record) spreadsheet (Excel 2000)
I want to find an average based on the info in ten fields. However, not all the fields have info so I have been using a count based upon an "COUNTIF" . For a single record, this works fine. However, if I try to calculate the figure for a number of records using an array, the info I get is the total count for ALL the records not the right count for each line.

The reason for this seems to be my inability to copr with the number of "colons" around.

The formula I'm using is = COUNTIF(A1:JI,">0")
putting the count in K1
But if I try to Array for say 10 lines using all sorts of ways using eg =((A1:A10)SadJ1:J10),">0") it puts the same answer in K1 to K10 which is the count of All the 100 fields in the square.

Please help before I go bananas!!!!!

Tony

Discussion is locked

- Collapse -
Re: Excel Arrays
Nov 24, 2004 2:22PM PST

Until which column and line is your spreadsheet? I have made a macro that covers only A1 to J10. If you could be more specific then we could adjust it to cover it that way. Anyway here is the macro I wrote. Try for a sample spreadsheet with numbers within cells from A1 to J10:

Dim cnt As Integer
cnt = 0
For i = 1 To 10
Range("A" & i).Select
For n = 1 To 10
If Not IsEmpty(ActiveCell) Then
If IsNumeric(ActiveCell) Then
cnt = cnt + ActiveCell.Value
End If
End If
ActiveCell.Offset(0, 1).Select
Next
Next
Range("K1").Select
ActiveCell.Value = cnt

Swisse

- Collapse -
Re: Excel Arrays
Nov 25, 2004 4:43AM PST

Swisse


Thanks for your help

Currently the spread covers H2:H249 & Q2:Q249 with the answer going in R2:R249.

However...................

You are assuming an awareness of Macros which, to put it crudely, "just aint there"

I've attempted to create a Macro - I'm not stupid really - but, as with all programs, there's a problem with headers, punctuation etc. so it's going to have to be absolutely ABC .

I'm frantically reading by 1000 page book but, it's a bit slow

Once again - Thanks

Tony

- Collapse -
Re: Excel Arrays
Nov 25, 2004 5:49AM PST

You're welcome. Take your time.

Swisse

- Collapse -
Re: Excel Arrays
Nov 25, 2004 7:41AM PST

Hi again

Right - Using your macro

"i" is the number of Horizontal Fields (the A to J )
"n" is the number of vertical fields ( in mine the 240 or so )
Range ("K1")is the first position of the calculated field

Right so far ?

First problem I see is although the horizontal format is fixed, new records enter as and when. Do I just change the n figure?

Not good news

Ist example gives me a count of 0 (should be 2) on first line and nothing thereafter.

I'm using a book on Excel 5 - yes I'm that old- but you could probably tell that by the fact that my apostrophes are in the right place .

My macro looks the same as theirs BUT .....
Does a highlight in yellow mean a problem?

Tony

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:04AM PST

Hi Tony. Hope you don't mind me butting in.

I'm like you, I don't really understand writing macros, although I can get away with recording a macro in Excel's Tools > Macros > Record New Macro option. But writing one from scratch is not for me.

I'm not sure what the problem is you are describing. You say you have a 6000 cell record, but want to find an average based on the info in 10 fields. What do you mean "based on the info in 10 fields"? Do you mean that you want to average H2:H249, but only the cells in the range that have a particular "marker" in any or all of the 10 information fields? And then do the same for the range Q2:Q249?

If so, this seems a pretty complicated set of rules for averaging something. What I mean is, it sounds like you want to average cells H2 though to H239, but only including the Hxxx cell in the average if x applies, and/or only if y applies, and/or only if z applies, and/or only if x1 applies, and/or if y1 applies, etc for all the 10 conditions.

What happens if all of the conditions apply, or some apply and others do not, or none of them apply? How does this affect the contents of individual cells in the H range? Or have I got this wrong?

I tend to use flags which is a bit like your 10 conditions, eg cell A1 will hold a number 7 if the flag in cell A10 is set, or the number 52 if the flag in cell A10 is not set. The "flag" cell would be set if it had a 1 in it, and not set if it had a 0 in it. So my formula in cell A1 would read (Without the " " marks");

"=IF(A10=0,52,(IF(A10=1,7,0)))"

Explanation: There are 3 sets of brackets.( ( ( ) ) ). Working from the Outside set first,
if A10=0, then A1 = 52.

If not, then (middle set)
if A10 = 1, then A1 = 7.

If no, then (inner set)
if A10 = neither 0 nor 1, A1 = 0.

If you can follow that, then you will see that just 2 conditions of 1 flag cell causes the target cell to have 3 possible values. You could enlarge the IF statement to allow for other conditions in other flag cells, but the formula gets very complicated indeed, because there could be so many different combinations of possibilities, (with 10 conditions there would be 10^10 different conditions). Also, Excel limits 7 IF statements in any one formula.

So, if you really do have a cell or cells whose contents depend upon 10 conditions, I see no easy way out of your problem.

Let me know if what you have is different to what I think you have.

Mark

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:22AM PST

Hi Mark

What I have is a 6000 record spreadsheet of about 28 columns. In those, are ten columns of dates running from latest to earliest. Thus the "frequency" is latest - earliest divided by occurence. So I have created a "dummy" of latest-earliest and all I need to do now is a dummy of occurence. The trouble is that it might not be ten so I need to count how many there actually are. On a line by line basis, the way I described it works a treat ie using COUNTIF. But trying to link all 6000 lines, you need an array formula so that each line is calculated separately.

BUT I can't get an array that works. All that I can do on a test bed of 10 records is to get it to add up ALL the occurences and then it puts that figure as the answer to each line.

Hence the communications re Macros

Does this make sense?

Tony

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:48AM PST

Hmmm

It does sound very complicated.

A question;

Does each row have 10 columns of cells which may or may not have dates in them?

Also, when you talk about "frequency", are you not really talking about a frequency from the earliest date to the latest date? What I mean is, if the first occurance was on Nov 1 2004, and the latest occurrence was on Nov 24 2004, then the frequency is the number of cells in that row of 10 that has a date in, divided by 24.

eg, something happened on 11/1, 11/2, 11/5, 11/14, 11/20, and 11/24. So, in the 24 day period 11/1 to 11/24, an occurrence happened 6 times, or 6/24 times per day, (on average).

To do that, you would use the COUNTBLANK function to count how many of the cells in that row were empty, then take that away from 24, then divide the answer by 24.

Then do that for each row.

(a chosen cell in the row)=COUNTBLANK(Q1:Z1) (if the Q to Z columns held the dates).

To get the number of days in the period between the earliest and latest date you would use;

(If Q1 held the earliest date and Z1 held the latest date),

(another chosen cell in the row)=Z1-Q1+1
This cell would have to be formatted to show as a number, and the "+1" is because the subtraction without it only shows the number of days from, but not including the first date.

However, I fear that this also is not what you mean, Happy

Mark

- Collapse -
Re: Excel Arrays
Nov 25, 2004 5:15PM PST

Nearly there !!!!!!!

Forget the Excel.

A man buys a pizza on the 1st April, 1st May, and 1st June. Today is 1st August. Frequency factor is wrong description. What I really mean is the average interval for buying pizzas. So he has bought 3 (Count of fields) and the time between Earliest and latest is a couple of months - 61 days . Therefore, he buys on average a pizza every 30.5 days. ( I've only just seen that I need to count the fields and deduct 1)

Next part is to see how long since purchase. Todays date - Date of last purchase gives 61 days. Therefore as we (you?) say " Houston, we have a problem"

The complications you quite rightly mention of adding 1 when messing about with dates are not really significant. I'm going to try again this evening (UK time) and see how we get on. I've got so many test spreadsheets, if I get the right answer, I'll probably never find it again.

Thanks for your interest

Tony

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:03PM PST

Hi again Tony,

This is becoming very interesting, Happy

And, it seems you're from the UK. Me too so that makes date handling easier for us two. I used USA date formats, mm/dd/yy because I thought you were in the US, but I prefer our own date handlng, ie, dd/mm/yy. I will do that from now on, (sorry to our US colleagues).

Kees is right of course, this is more a database question and Access would perhaps be much better in handling this type of project. But I never could get a handle on databases. I learned on an old DOS program called DataEase, but it was a beast to understand. It didn't even use the Enter, (Return), key properly, but you had to enter data or programming using the F2 key. Weird!! So, that sort of put me off databases, and if you're the same, and if you are having difficulty with Kees' method, (he would not have had the use of your further description of the project when he posted his reply), try this;

A man buys a pizza on 1/4, 1/5, and 1/6. Today is 1/8. Lets change this to 1/8, 1/9, 1/10 and today's date is 26/11, (you cannot have today's date as 1/8, because it isn't. It is 26/11 or whatever date you are reading this. You would have to change the Windows date which I wouldn't advise).

We do not count today, because the day has not yet ended and he may buy one later tonight. So, the interval is 25/11 - 1/8. In Excel this is (cell)=Today's date - (cell containing 1/Cool

Today's date is =NOW(). If you typed this into a spare cell, you would get the date and time, and to remove the time, (which I imagine you don't want), you would format the cell to show a date only. But, if you used NOW() as part of a larger formula, you would just format the cell as a number, then use the INT function to remove any decimal of a day, eg cell=INT(Now()-(cell containing 1/8)).

In my example, from 1/8 to Now, (26/11), the formula =INT(Now()-(cell containing 1/Cool gives 117 days.

So the period is 117 days. But you now have to count up how many days he brought a pizza on. Again you could use the COUNTBLANK function. In the row of dates, count up in the range how many blank cells there are, ie cells for dates where he did not buy a pizza. Take that away from 117, and you know he brought pizza's on x number of days. (Let's use x for now).

So, assuming a row of cells in the range D10 to JL10, (or whatever column is the 365th column for 365 days in the year). Why? What if he brought a pizza every day? Sad, perhaps, but you may have to consider the possiblilty, Happy

Now I've thought about this. You cannot get more columns than column IV, (letter I, letter V), which is only 270 or so days. I believe that you "should" consider the possibility that he buys one every day. In that case, using rows is not the best method. The best method is using columns, eg a column of cells in the range D10 to D375. Why start at D10? So you can have page titles and column headers above.

Each cell in the range is a flag. If it is empty, he did not buy a pizza. If it has a one in it, (or any other character of your choice), he did. So, on each day he brought a pizza, you enter a 1 in that date cell.

For how many days did he buy a pizza? That is =365-COUNTBLANK(D10 : D375), or x number of days, using the notation x above. (I had to put spaces in between the D10 colon D375, because I got that pesky smiley in between!).

How many days in the period? That depends on the date that D10 represents. If D10 represents 1/1/04, then the number of days is =INT(NOW()-D10), formatted as a number. Let's use y days.

There is a problem using 2004 as the year. It is a leap year, (ie 29/2/04), so the range would have to be D10 to D376.

So, he brought x pizza's in a total of y days. He buys a pizza on average every y/x days. Use =INT(y-x), or =INT(Cell containing y - Cell containing x). Format the cell as a number to 2 decimal places.

There you have it. Bulky and long in the column range, but you can adapt this perhaps.

What problems might you find?

Well, one is the length of the page. You may want to show your results on another sheet.
Also, what if you want the spreadsheet to show how many other things he brought; eg he may not buy a pizza, but buy a pie and chips on some days. Or, on the days he did buy a pizza, he brought a Coke on some days and a soda pop, (fizzy drink to us in the UK), on others, or nothing else on still others. Perhaps COUNTIF will help there, but there could be problems using the COUNTBLANK.
More than one customer? I hope so, but how many? 270? that's how many columns you could have on one sheet.

Let me know if any of this is useful, or if not what problems you encounter.

Mark

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:47PM PST

Mark

This is getting more interesting.

From a marketing point of view, there's no real importance in anything other than the last 10 purchases.

Lets forget MS and go back to what this is all about.
I want to define a customer base by the following

a) Geography
b) Frequency of latest purchase history
c) Actual current behaviour

a) is simple - done on street names and then given a classification

b) involves entering sales as they occur - most recent on left and then shifting the data accross from left to right using cut and paste. After 10, last one drops off

c) involves the comparison between the "last date of purchase in relationship to the current date" in relationship to the expectation of when he should have bought. This is stop unecessary follow-up; if a customer only buys once every six months, then why bother until at least six months have gone. However if it's six months since he's bought and he buys every month then, unless he's moved - problems.


That's the background; obviously there's a bit more to it manually but that's the gist - actual against predicted.

So going back to Excel - Don't even think about A*****s - the solutions all work on the one line. They probably all would work on the entire spreadsheet if only I knew how to do it.

I'm used to converting simple formulas to arrays but when you have a range already in the formula, whether its Min/Max/Countif or whatever , I can't get it to extrapolate the formula for the whole column. Yes it does something - treats the range as a rectangular block and adds up rows and columns alike.

Hence my question
Anyway to all of you, my thanks. I assumed you were all Transatlantic residents because of the time we were working - it looks as though you lot get no sleep either

Tony

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:07AM PST

Ok Tony, I have to clarify two things:
1. The i is the counter of the vertical fields, and n is the counter of the horizontal ones.
2. I'm getting a little mixed up. Please rephrase your problem once again. Are you trying to count the cells from A1 to A200+ and from J1 to J200+ which contain numbers, specifically more than zero, and putting the results in K1 to K200+? Please explain clearly.

Swisse

- Collapse -
Re: Excel Arrays
Nov 25, 2004 8:31AM PST

No

Horizontal line is the record for person A

On that line might be 10 dates (or any number between 1 and 10). I need to know how many dates per horizontal line there are. This is the count of occurences.

I also need to know the interval between the Latest date and the earliest date. This will give me a period in days which, when I divide it by the "count of occurences" will give me an average interval .

Then I will be able to sort all the people to give me a ranking list of "who uses most"

Does that help?

Tony

- Collapse -
Re: Excel Arrays
Nov 25, 2004 4:53PM PST

Tony,

Once again, swisse and I seem to have different ideas. He's more a macro-person, I'm more of a formula-person. Which means you've got something to choose. I hope we all agree that's a good thing.

But it's clear now what you want. That most certainly helps.

On a line (in a range), the number of fields filled in a range is a count. The minimum is a min, the maximum a max, so the average interval is (max - min) / (count - 1). Say there are 2: count = 2, max = 11/25, min = 11/23, than average interval = 2 / 1 = 2. Agreed? You'll have to take into account the special cases with count = 0 (empty row) and count = 1 (1 date, no interval), and possibly min=max (interval = 0), so you need a few nested if's, but all can be done with one formula.

Put your formula in a separate column, use Data>Sort to sort on it, and that's the whole story.

In fact, I'm even more a database person. In my view, problems like this should be attacked with a program like MS Access, not with a spreadsheet. But that's a different story.

Hope this helps.

Kees

- Collapse -
Re: Excel Arrays
Nov 26, 2004 1:39AM PST

I think I may have found the solution to your problem. I hope I didn't misunderstand you this time. Ok, I have a spreadsheet from A1 to J10, within which can be found some dates. The macro below starts from A1 and goes from cell to cell until J1 looking for cells with dates within them. If it finds one it stores it in tmp1. If it finds another it compares it with tmp1. If tmp1 is greater than tmp2 then it stores it in tmp2, which should by now be greater than tmp1. After it searches the ten columns(A-J) it skips one column, the K column, and goes to column L writes on the cell the content of tmp2 which is the latest date. It then goes to the next column, the M column, and writes the earliest date. It then goes to the next column, the N column, and writes the interval or difference between the latest and earliest dates. On the next column, the O column, it writes the average of the interval/difference: average = difference/occurences. It then goes downwards and does that again to the other lines.
There is only one requirement for this macro to function properly:
1. The latest date should be on the first columns, and the latest ones on the last ones.

So here again the revised code, and put it inside a procedure:

Application.ScreenUpdating = False
On Error Resume Next
Dim cnt As Integer
Dim diff As Integer
cnt = 0
For i = 1 To 10
Range("A" & i).Select
For n = 1 To 10
If Not IsEmpty(ActiveCell) And IsDate(ActiveCell) Then
cnt = cnt + 1
tmp1 = ActiveCell.Value
If tmp1 > tmp2 Then
tmp2 = tmp1 'tmp2 holds latest date
End If 'tmp1 holds earliest date
End If
ActiveCell.Offset(0, 1).Select
Next
ActiveCell.Offset(0, 1).Select
If tmp2 = vbNull Then
ActiveCell.Value = ""
Else
ActiveCell.Value = tmp2
End If
ActiveCell.Offset(0, 1).Select
If tmp1 = vbNull Then
ActiveCell.Value = ""
Else
ActiveCell.Value = tmp1
End If
ActiveCell.Offset(0, 1).Select
diff = DateDiff("d", tmp1, tmp2)
If diff = 0 Then
ActiveCell.Value = vbNullString
Else
ActiveCell.Value = diff
End If
ActiveCell.Offset(0, 1).Select
If diff = 0 Or cnt = 0 Then
ActiveCell.Value = vbNullString
Else
ActiveCell.Value = diff / cnt
End If
cnt = 0
diff = 0
tmp1 = vbNull
tmp2 = vbNull
Next
Application.ScreenUpdating = True

Swisse

- Collapse -
Re: Excel Arrays
Nov 26, 2004 4:58AM PST

Well **** me

IT WORKS.

Being totally ungrateful, it would be easier for entry if I could go from Latest to Earliest but sod that.

Thanks a bundle

(and that goes to all of you)

Mind you, the spreadsheet's not finished yet!!!!!!!!


Tony

- Collapse -
(NT) (NT) You're welcome.
Nov 26, 2004 5:32AM PST
- Collapse -
Re: (NT) You're welcome.
Nov 26, 2004 6:53AM PST

Good one! Happy