Wednesday, March 28, 2012

modeling age

I'm a newcomer to AS, but familiar with traditional SQL/relational stuff. I'm getting used to dimensional modeling techniques, and have come across a question. We have a situation where users can specify age in different units, for example if the age is 10 months, the the age is marked as 10, the age "class" is months. Or it may be 10 for age, and the "class" is years. The classes can range from minutes --> hours --> up to years. The age quantity is dependant on the value provided for the age class. For example, if they provide the "months" class, then the age must fall between 1 & 12.

I'm wondering what the best way to model this both in the dimensional relational database that will feed AS, and is there any AS tools (calculations/hierarchies, etc.) that would make this easier. And if so - can someone point me in the right direction.

I can think of two possible options.

1) store the data in one single format and then use calculations to display the different classes.

2) create a different measure for each class type.

I would not mix different units of data in the one measure. It means you cannot aggregate the data and will lead to lots of head aches down the track. Option 1) might loose a little bit of accuracy if you need to span time frames between years and seconds, option 2) would use more storage space, but you could keep the orginal fidelity of the data.

|||

Darren - thanks for the response. A follow up question....don't know much about measure groups (as you may see in my question) - but in our situation the age is not the measure, so sure not your suggestion will do. For instance our users aren't querying average age by City. They're querying # of events by age. Isn't age then a dimension (or part of one?). Then how do you do a measure group or measure on age?

I guess I'm a little lost when it comes to measure groups. Actually I'm a little lost when it comes to anything that ventures outside of the wizards in AS.

|||

If you want to filter and slice other measures by age, then you are right, you would model it as an attribute, not as a measure. I think this makes it even more important to normalize the ages. I am thinking that if one user enters "120 minutes" and another "2 hours", then you want both of these ages to come up as being equivalent.

It probably depends more on what type of analysis your users will be doing. Often in this type of situation you would set-up an age dimension with some useful breakdowns. You can then do your breakdowns at any granularity that makes sense and group things together into hierarchies.

eg. Your could even do something like the following

AgeId

Age Group1 Group2 1 0-30 Minutes 1 Day Upto 1 month 2 31-60 Minutes 1 Day Upto 1 month 3 1-2 Hours 1 Day Upto 1 month 4 3-6 Hours 1 Day Upto 1 month 5 7-12 Hours 1 Day Upto 1 month 6 13-24 Hours 1 Day Upto 1 month 7 1-2 Days 2-10 Days Upto 1 month 8 3-5 Days 2-10 Days Upto 1 month 9 5-10 Days 2-10 Days Upto 1 month 10 11-20 days 11-30 Days Upto 1 month 11 21-30 Days 11-30 Days Upto 1 month 12 31-60 Days 2-3 Months 2 months to 1 year 13 61-90 Days 2-3 Months 2 months to 1 year 14 4-6 Months 4-12 Months 2 months to 1 year 15 7-12 months 4-12 Months 2 months to 1 year 16 1-2 Years More than 1 year More than 1 year 17

3-5 Years More than 1 year More than 1 year

Then as you populated your fact table you would insert the appropriate AgeID into it.

No comments:

Post a Comment