Showing posts with label modeling. Show all posts
Showing posts with label modeling. Show all posts

Wednesday, March 28, 2012

modeling age

I'm a newcomer to AS, but familiar with traditional SQL/relational stuff. I'm getting used to dimensional modeling techniques, and have come across a question. We have a situation where users can specify age in different units, for example if the age is 10 months, the the age is marked as 10, the age "class" is months. Or it may be 10 for age, and the "class" is years. The classes can range from minutes --> hours --> up to years. The age quantity is dependant on the value provided for the age class. For example, if they provide the "months" class, then the age must fall between 1 & 12.

I'm wondering what the best way to model this both in the dimensional relational database that will feed AS, and is there any AS tools (calculations/hierarchies, etc.) that would make this easier. And if so - can someone point me in the right direction.

I can think of two possible options.

1) store the data in one single format and then use calculations to display the different classes.

2) create a different measure for each class type.

I would not mix different units of data in the one measure. It means you cannot aggregate the data and will lead to lots of head aches down the track. Option 1) might loose a little bit of accuracy if you need to span time frames between years and seconds, option 2) would use more storage space, but you could keep the orginal fidelity of the data.

|||

Darren - thanks for the response. A follow up question....don't know much about measure groups (as you may see in my question) - but in our situation the age is not the measure, so sure not your suggestion will do. For instance our users aren't querying average age by City. They're querying # of events by age. Isn't age then a dimension (or part of one?). Then how do you do a measure group or measure on age?

I guess I'm a little lost when it comes to measure groups. Actually I'm a little lost when it comes to anything that ventures outside of the wizards in AS.

|||

If you want to filter and slice other measures by age, then you are right, you would model it as an attribute, not as a measure. I think this makes it even more important to normalize the ages. I am thinking that if one user enters "120 minutes" and another "2 hours", then you want both of these ages to come up as being equivalent.

It probably depends more on what type of analysis your users will be doing. Often in this type of situation you would set-up an age dimension with some useful breakdowns. You can then do your breakdowns at any granularity that makes sense and group things together into hierarchies.

eg. Your could even do something like the following

AgeId

Age Group1 Group2 1 0-30 Minutes 1 Day Upto 1 month 2 31-60 Minutes 1 Day Upto 1 month 3 1-2 Hours 1 Day Upto 1 month 4 3-6 Hours 1 Day Upto 1 month 5 7-12 Hours 1 Day Upto 1 month 6 13-24 Hours 1 Day Upto 1 month 7 1-2 Days 2-10 Days Upto 1 month 8 3-5 Days 2-10 Days Upto 1 month 9 5-10 Days 2-10 Days Upto 1 month 10 11-20 days 11-30 Days Upto 1 month 11 21-30 Days 11-30 Days Upto 1 month 12 31-60 Days 2-3 Months 2 months to 1 year 13 61-90 Days 2-3 Months 2 months to 1 year 14 4-6 Months 4-12 Months 2 months to 1 year 15 7-12 months 4-12 Months 2 months to 1 year 16 1-2 Years More than 1 year More than 1 year 17

3-5 Years More than 1 year More than 1 year

Then as you populated your fact table you would insert the appropriate AgeID into it.

modeling age

I'm a newcomer to AS, but familiar with traditional SQL/relational stuff. I'm getting used to dimensional modeling techniques, and have come across a question. We have a situation where users can specify age in different units, for example if the age is 10 months, the the age is marked as 10, the age "class" is months. Or it may be 10 for age, and the "class" is years. The classes can range from minutes --> hours --> up to years. The age quantity is dependant on the value provided for the age class. For example, if they provide the "months" class, then the age must fall between 1 & 12.

I'm wondering what the best way to model this both in the dimensional relational database that will feed AS, and is there any AS tools (calculations/hierarchies, etc.) that would make this easier. And if so - can someone point me in the right direction.

I can think of two possible options.

1) store the data in one single format and then use calculations to display the different classes.

2) create a different measure for each class type.

I would not mix different units of data in the one measure. It means you cannot aggregate the data and will lead to lots of head aches down the track. Option 1) might loose a little bit of accuracy if you need to span time frames between years and seconds, option 2) would use more storage space, but you could keep the orginal fidelity of the data.

|||

Darren - thanks for the response. A follow up question....don't know much about measure groups (as you may see in my question) - but in our situation the age is not the measure, so sure not your suggestion will do. For instance our users aren't querying average age by City. They're querying # of events by age. Isn't age then a dimension (or part of one?). Then how do you do a measure group or measure on age?

I guess I'm a little lost when it comes to measure groups. Actually I'm a little lost when it comes to anything that ventures outside of the wizards in AS.

|||

If you want to filter and slice other measures by age, then you are right, you would model it as an attribute, not as a measure. I think this makes it even more important to normalize the ages. I am thinking that if one user enters "120 minutes" and another "2 hours", then you want both of these ages to come up as being equivalent.

It probably depends more on what type of analysis your users will be doing. Often in this type of situation you would set-up an age dimension with some useful breakdowns. You can then do your breakdowns at any granularity that makes sense and group things together into hierarchies.

eg. Your could even do something like the following

AgeId

Age Group1 Group2 1 0-30 Minutes 1 Day Upto 1 month 2 31-60 Minutes 1 Day Upto 1 month 3 1-2 Hours 1 Day Upto 1 month 4 3-6 Hours 1 Day Upto 1 month 5 7-12 Hours 1 Day Upto 1 month 6 13-24 Hours 1 Day Upto 1 month 7 1-2 Days 2-10 Days Upto 1 month 8 3-5 Days 2-10 Days Upto 1 month 9 5-10 Days 2-10 Days Upto 1 month 10 11-20 days 11-30 Days Upto 1 month 11 21-30 Days 11-30 Days Upto 1 month 12 31-60 Days 2-3 Months 2 months to 1 year 13 61-90 Days 2-3 Months 2 months to 1 year 14 4-6 Months 4-12 Months 2 months to 1 year 15 7-12 months 4-12 Months 2 months to 1 year 16 1-2 Years More than 1 year More than 1 year 17

3-5 Years More than 1 year More than 1 year

Then as you populated your fact table you would insert the appropriate AgeID into it.

modeling a table with FK

I am having a problem when modeling a Foreign Key in an "Operations" table. This table holds all information on customers ′s applications and withdrawals.

Here is the structure:

CustomerID int, SourceID int, Value decimal (16,2), OperationDate datetime

Well the problem is that SourceID sometimes might be NULL depending on how the record was inserted. So its kind of cumbersome to define it as an FK, since it can be null...To get things worse, this SourceID might point to more than 1 table (depending on the CustomerType it will point to SourceA table or SourceB table)...

How should this be modeled?

What is sourceId? It sounds like you need to do a bit more normalization here and have a source table.

source
sourceId int
customerType
<other source bits>

Then your other tables like SourceA and SourceB reference this table, as well as the operations table:

create table operations
(
...I assume you have a key other than these columns,
customerId int
sourceId int null references source(sourceId)
)

Then the source will either be sourceA or sourceB depending on the customerType (assuming you are defining the source for a given customerType.) I am not sure that I am making sense, but this sounds like what I am getting from your post. If not, can you give your current table structures and a bit more about usage?

Modeling a Matrix

What is the best way to model a matrix in terms of table design? I want to
model a sociogram, which is basically a network diagram with weights
assigned to the edges. Once, established I would want to use it to do matrix
computations (matrix algebra). So, the simple version would be something
like the following:
0 1 1
1 0 1
1 1 0
The rows and columns will constitute the same set of data. In other words,
the matrix will reflect the relationship between like entities from the same
set. Relationships between persons of a given set of people, for example.
Just wondering if there is a clever way of modeling this, or if I should
just use a table with three columns: entityA, entityB, edgeValue...
Thanks
BKGet a copy of SQL FOR SMARTIES; there is a whole chapter on matrix math
in SQL.
CREATE TABLE Martix
(i INTEGER NOT NULL CHECK (i BETWEEN 1 AND <<n1>>,
j INTEGER NOT NULL CHECK (i BETWEEN 1 AND <<n2>>,
k INTEGER NOT NULL CHECK (i BETWEEN 1 AND <<n3>>,
element_value FLOAT NOT NULL,
PRIMARY KEY (i, j, k));
Then you talk about graphs in SQL with a sociogram -- which is it?|||(grabs SQL FOR SMARTIES off of his bookshelf...and there it is on p303! ..
should have checked there first)
Excellent, thanks! I thought that might be the best way to do, but just
wanted to validate my gut feeling...
The answer to your question is: both. Really its just a matrix
representation of a sociogram, so the edges will represent the relationships
between the nodes (and the elements of the matrix). The row and column
vectors will be the same and will represent the "people".
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1115680013.666216.159290@.f14g2000cwb.googlegroups.com...
> Get a copy of SQL FOR SMARTIES; there is a whole chapter on matrix math
> in SQL.
> CREATE TABLE Martix
> (i INTEGER NOT NULL CHECK (i BETWEEN 1 AND <<n1>>,
> j INTEGER NOT NULL CHECK (i BETWEEN 1 AND <<n2>>,
> k INTEGER NOT NULL CHECK (i BETWEEN 1 AND <<n3>>,
> element_value FLOAT NOT NULL,
> PRIMARY KEY (i, j, k));
> Then you talk about graphs in SQL with a sociogram -- which is it?
>|||BK,
See http://groups.google.co.uk/groups?q=BBEB95_9DBB7E for an example.
Steve Kass
Drew University
BK wrote:

>What is the best way to model a matrix in terms of table design? I want to
>model a sociogram, which is basically a network diagram with weights
>assigned to the edges. Once, established I would want to use it to do matri
x
>computations (matrix algebra). So, the simple version would be something
>like the following:
>0 1 1
>1 0 1
>1 1 0
>The rows and columns will constitute the same set of data. In other words,
>the matrix will reflect the relationship between like entities from the sam
e
>set. Relationships between persons of a given set of people, for example.
>Just wondering if there is a clever way of modeling this, or if I should
>just use a table with three columns: entityA, entityB, edgeValue...
>Thanks
>BK
>
>