merge author aliases based on institution name and publication count

40 Views Asked by At

The following author has published many papers from different institutes. I can link some of the alias names with a cartesian join. Is it possible to map all aliases to one primary author?

with t1 as 
(

select 'jones-william-j' as alias,'jones-w' as last_name_init,'The Institute Of Cancer Research, London'as institute_name,1 as pub_count
union all select 'jones-w-j','jones-w','University Of California',42
union all select 'jones-w-j','jones-w','Western University',2
union all select 'jones-willam-j','jones-w','University Of California',2
union all select 'jones-william','jones-w','University Of California',12
union all select 'jones-william-j','jones-w','University Of California',429
union all select 'jones-william-j','jones-w','University Of San Diego',1
union all select 'jones-william-j','jones-w','Va San Diego Healthcare System',2
union all select 'jones-william-j','jones-w','Western University',7

)
select * from t1;
alias last_name_init institute_name pub_count
jones-william-j jones-w The Institute Of Cancer Research, London 1
jones-w-j jones-w University Of California 42
jones-w-j jones-w Western University 2
jones-willam-j jones-w University Of California 2
jones-william jones-w University Of California 12
jones-william-j jones-w University Of California 429
jones-william-j jones-w University Of San Diego 1
jones-william-j jones-w Va San Diego Healthcare System 2
jones-william-j jones-w Western University 7

The cartesian join give the following 2 authors jones-w-1 and jones-w-2, is it possible to merge these into one author_id jones-w-1?

with t1 as 
(

select 'jones-william-j' as alias,'jones-w' as last_name_init,'The Institute Of Cancer Research, London'as institute_name,1 as pub_count
union all select 'jones-w-j','jones-w','University Of California',42
union all select 'jones-w-j','jones-w','Western University',2
union all select 'jones-willam-j','jones-w','University Of California',2
union all select 'jones-william','jones-w','University Of California',12
union all select 'jones-william-j','jones-w','University Of California',429
union all select 'jones-william-j','jones-w','University Of San Diego',1
union all select 'jones-william-j','jones-w','Va San Diego Healthcare System',2
union all select 'jones-william-j','jones-w','Western University',7

)
select distinct
     a.last_name_init||'-'||DENSE_RANK() OVER (ORDER BY a.last_name_init,a.institute_name ASC)  author_id,
a.alias,
a.last_name_init,
a.institute_name,
a.pub_count
 from t1 a,
t1 b
  where 1=1
  and a.institute_name = b.institute_name
  and a.last_name_init = b.last_name_init
  and a.alias <> b.alias;

The result is as follows.

author_id alias last_name_init institute_name pub_count
jones-w-1 jones-w-j jones-w University Of California 42
jones-w-1 jones-willam-j jones-w University Of California 2
jones-w-1 jones-william jones-w University Of California 12
jones-w-1 jones-william-j jones-w University Of California 429
jones-w-2 jones-w-j jones-w Western University 2
jones-w-2 jones-william-j jones-w Western University 7

The desired result would be.

author_id alias last_name_init institute_name pub_count
jones-w-1 jones-w-j jones-w University Of California 42
jones-w-1 jones-willam-j jones-w University Of California 2
jones-w-1 jones-william jones-w University Of California 12
jones-w-1 jones-william-j jones-w University Of California 429
jones-w-1 jones-w-j jones-w Western University 2
jones-w-1 jones-william-j jones-w Western University 7
0

There are 0 best solutions below