I'm using MySQL Workbench v8.0.36 to learn SQL and I have this table named 'Employee'
| EmpID | Empname | Gender | Salary | City | RowNum |
|---|---|---|---|---|---|
| 1 | Arjun | M | 75000 | Pune | 1 |
| 1 | Arjun | M | 75000 | Pune | 2 |
| 2 | Ekadanta | M | 125000 | Bangalore | 1 |
| 3 | Lalita | F | 150000 | Mathura | 1 |
| 4 | Madhav | M | 250000 | Delhi | 1 |
| 5 | Vishakha | F | 120000 | Mathura | 1 |
as you can see, employee no.1's data has been entered twice in this table.
Note - Column 'RowNum' isn't in the original 'Employee' table, only meant for demonstration...
I can get the desired results by simply using
create table newemp as
(select distinct * from employee);
select * from newemp;
but this 'creates' a new table instead of 'deleting' the duplicates from the original table. If I HAD to run a delete operation and delete the duplicate entry in the OG table itself, how can I do that?
So far, I tried using two basic methods but at the end whether I use
- CTE method
with cte as (
select *,
row_number() over (partition by empid order by empname) as rownum
from employee
)
delete from employee
where empid in (select empid from cte where rownum > 1);
- OR I use count(*)
delete from employee
where empid in (select empid from
(select empid from employee group by empid having count(empid) > 1)
as emp2 ) ;
IN the 'Employee' table itself, since 'EmpID' or any other column doesn't truly uniquely identify the rows, I am not able to target the duplicate entries in the WHERE clause and as a result, the query deletes any and all entries where the relevant conditions are met whereas I want to keep one and delete the rest...
AFAIK I cannot target only the duplicate entries and delete only those entries in the absence of a truly unique identifier in the original table,
is it somehow possible to use the COUNT of Empid to instruct the workbench to delete only a select number of instances of any given employee's data that has been entered multiple times?
PS- sorry if the code or table doesn't come out neatly in the post, not familiar with formatting lines of code.