1 billion rows DW to DM

72 Views Asked by At

I have a design/performance question.

I have this next table.

CREATE TABLE [dbo].[DW_Visits_2016](
    [VisitId] [int] NOT NULL,
    [UserId] [int] NOT NULL,
    [VisitReferrer] [varchar](512) NULL,
    [VisitFirstRequest] [varchar](255) NOT NULL,
    [VisitAppName] [varchar](255) NULL,
    [VisitCountry] [varchar](50) NULL,
    [VisitDate] [smalldatetime] NOT NULL,
    [VisitMins] [int] NOT NULL,
    [VisitHits] [int] NOT NULL,
    [EntryTag] [varchar](100) NOT NULL,
    [VisitCount] [int] NOT NULL,
    [VisitInitialDate] [datetime] NOT NULL,
    [AggregateType] [varchar](50) NULL,
    [MemberId] [int] NULL,
    [ServerName] [varchar](50) NULL,
    [BrowserUserAgent] [varchar](255) NULL,
    [LastModifiedDate] [smalldatetime] NULL,
    [Guid] [uniqueidentifier] NULL,
    [SessionId] [varchar](100) NULL,
    [IPAddress] [varchar](40) NULL,
 CONSTRAINT [PK_Visits] PRIMARY KEY NONCLUSTERED 
(
    [VisitId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)

GO

SET ANSI_PADDING OFF
GO

ALTER TABLE [dbo].[Visits]  WITH CHECK ADD  CONSTRAINT [CK_Visits_VisitDate] CHECK  (([VisitDate]>='2016-01-01' AND [VisitDate]<'2017-01-01'))
GO

ALTER TABLE [dbo].[Visits] CHECK CONSTRAINT [CK_Visits_VisitDate]

And this same table for 2015 ... 2010.

Every table has around 150 million rows. So, combined we are talking about 1,050 million rows.

I received a requirement where BI people wants to have this combined on a single view (Something crazy like select * from all_visits).

Luckily they gave me some ‘where’ clauses, and some columns they don’t need, so the final result would be 6 columns and 20% of the rows (210 million rows), but nonetheless, a ‘view’ is just a stored query. Even though the box has 60GB of ram, it’s shared with many other databases.

Options I see:

  1. Instead of a view… Creating the views as tables and move them to a dedicated box.
  2. Create one view per year?
  3. Switch all of this to mongodb or something like vertica?!
  4. Any of the previous options combined with column stored indexes?
0

There are 0 best solutions below